NOVEL LIPASE GENES

CROSS-REFERENCES TO RELATED APPLICATIONS 

Pursuant to 35 USC § 119(e), this application claims priority to and benefit of
U.S. Provisional Patent Application Serial Nos. 60/217954, filed on July 13, 2000, and
60/300378, filed on June 21, 2001, the disclosures of each of which are incorporated herein in
their entirety for all purposes.

FIELD OF THE INVENTION 

The present invention relates to the generation of novel lipase genes and
homologues and to methods of recombination to produce novel lipase genes.

COPYRIGHT NOTIFICATION 

Pursuant to 37 C.F.R. § 1.71(e), Applicants note that a portion of this
disclosure contains material which is subject to copyright protection. The copyright owner
has no objection to the facsimile reproduction by anyone of the patent document or patent 
disclosure, as it appears in the Patent and Trademark Office patent file or records, but 
otherwise reserves all copyright rights whatsoever. 

BACKGROUND OF THE INVENTION

Lipases are enzymes which are involved in the breakdown of fats. Lipases are
commercially important enzymes which have many current uses, including as reagents in
food preparation processes (e.g., as additives to animal feeds), industrial degradative
processes, crop engineering and even as treatments for several human diseases (e.g.,
indigestion and heartburn (e.g., for pancreatic insufficiency), secondary cystic fibrosis,
Celiac disease, Crohn's disease, obesity, etc.). The activities and sequences of several
hundred lipases are known. See, e.g., www.led.uni-stuttgart.de/.

Because lipase enzymes are of considerable commercial value, the
identification and development of new lipase enzymes is desirable. The present invention
relates to new lipase proteins and nucleic acids, e.g., having novel sequences and activities,
as well as variants thereof.

SUMMARY OF THE INVENTION

The invention provides lipase polypeptides, nucleic acids encoding the
polypeptides, antibodies to the polypeptides, and uses therefor, data sets containing character
strings of lipase homologue sequences and automated systems for using the character strings
as well as other functions that will be apparent upon further review. The present invention
also provides methods of producing modified lipase polypeptides. 

Various aspects of the current invention comprise an isolated or recombinant
polypeptide comprising a sequence having at least 97% amino acid sequence identity to any
one of SEQ ID NO: 75 to SEQ ID NO: 108. Such polypeptide can optionally comprise or
exhibit lipase activity (e.g., it can degrade geranyl butyrate or neryl butyrate or both).
Additionally, such polypeptide can exhibit enantioselectivity for geranyl butyrate over neryl
butyrate. Such polypeptide that exhibits enantioselectivity for geranyl butyrate can comprise
a sequence selected from: SEQ ID NO:76, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:86,
SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:104, SEQ
ID NO:107, SEQ ID NO:108, SEQ ID NO:78, SEQ ID NO:87, SEQ ID NO:100, SEQ ID
NO:75, SEQ ID NO:77, SEQ ID NO:88, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:103,
or SEQ ID NO:106. Alternatively, the polypeptide can exhibit enantioselectivity for neryl
butyrate over geranyl butyrate. Such polypeptide that exhibits enantioselectivity for neryl
butyrate over geranyl butyrate can comprise a sequence selected from: SEQ ID NO:81, SEQ
ID NO:82, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:89, SEQ ID NO:90, SEQ ID
NO:94, SEQ ID NO:95, SEQ ID NO:105, SEQ ID NO:84, SEQ ID NO:91, SEQ ID NO:92,
or SEQ ID NO:93.
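
The 97% identity criterion above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with '-' marking gaps) and that identity is counted over the compared columns; the alignment procedure and the exact identity definition used for the SEQ ID NOs are not given in this passage, so the function names and gap handling below are illustrative assumptions only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap ('-') are ignored,
    and a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 97.0) -> bool:
    """True if the pair reaches the given percent identity (97% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 100-residue sequences that differ at three columns give exactly 97% and would satisfy the default threshold, while four differences would not.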

Furthermore, the polypeptide can comprise a polypeptide encoded by a
polynucleotide sequence which hybridizes under highly stringent conditions over
substantially the entire length of a polynucleotide sequence selected from SEQ ID NO: 1-54
(or a complementary sequence thereof), or by a polynucleotide sequence encoding a
polypeptide sequence selected from SEQ ID NO: 55-108 (or a complementary sequence
thereof), and wherein the polypeptide comprises one or more of: Lys at position 1; Thr at
position 14; Ser at position 17; Arg at position 22; Glu at position 26; Pro at position 31; Gly
at position 33; Glu at position 34; Pro at position 35; Pro or Thr at position 37; Ser or Lys at
position 41; Gly at position 42; Arg or Glu at position 43; Ala at position 61; Tyr at position
75; Gly at position 96; Ser at position 97; Thr at position 104; Ser at position 107; Ala at
position 125; Gly at position 129; Val at position 134; Cys at position 138; Lys at position
141; Lys at position 146; Thr at position 156; Met at position 160; Arg at position 166; or His
at position 177. Alternatively, the polypeptide can comprise one or more of: Lys at position
1; Thr at position 14; Ser at position 17; Arg at position 22; Glu at position 26; Pro at
position 31; Gly at position 33; Glu at position 34; Pro at position 35; Pro or Thr at position
37; Ser or Lys at position 41; Gly at position 42; Arg or Glu at position 43; Ala at position
61; Tyr at position 75; Gly at position 96; Ser at position 97; Thr at position 104; Ser at
position 107; Ala at position 125; Gly at position 129; Val at position 134; Cys at position
138; Lys at position 141; Lys at position 146; Thr at position 156; Met at position 160; Arg at
position 166; or His at position 177.
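
As a rough illustration of the "one or more of" residue list above, the following sketch reports which recited positions a candidate sequence satisfies. The residue table is transcribed from the text using one-letter codes; the numbering scheme is an assumption (1-based, counted from the first residue of whatever sequence is passed in), since this passage does not state which reference numbering the positions follow. The names `RECITED_RESIDUES` and `recited_positions_present` are hypothetical.

```python
from typing import List

# Residues recited in the text, keyed by 1-based position (one-letter codes).
# Positions with two letters accept either residue (e.g. 37: Pro or Thr).
RECITED_RESIDUES = {
    1: "K", 14: "T", 17: "S", 22: "R", 26: "E", 31: "P", 33: "G", 34: "E",
    35: "P", 37: "PT", 41: "SK", 42: "G", 43: "RE", 61: "A", 75: "Y",
    96: "G", 97: "S", 104: "T", 107: "S", 125: "A", 129: "G", 134: "V",
    138: "C", 141: "K", 146: "K", 156: "T", 160: "M", 166: "R", 177: "H",
}


def recited_positions_present(sequence: str) -> List[int]:
    """Return the 1-based positions at which `sequence` carries a recited residue.

    Assumption: position 1 is the first character of `sequence`.
    """
    hits = []
    for position, allowed in sorted(RECITED_RESIDUES.items()):
        if position <= len(sequence) and sequence[position - 1].upper() in allowed:
            hits.append(position)
    return hits
```

A sequence satisfies the "one or more of" language under this reading whenever the returned list is non-empty.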

Such polypeptide can comprise or exhibit lipase activity or the ability to 
degrade geranyl butyrate, neryl butyrate, or both neryl and geranyl butyrate. The polypeptide 
can also exhibit enantioselectivity for geranyl butyrate over neryl butyrate. A polypeptide
exhibiting enantioselectivity for geranyl butyrate over neryl butyrate can comprise one or
more of: Arg at position 22; Gly at position 33; Ser or Lys at position 41; Arg at position 43; 
Ser at position 107; Lys at position 141; Lys at position 146; Met at position 160; or His at 
position 177, or can comprise one or more of: Arg at position 43; or Ser at position 107. 

Such polypeptide can alternatively comprise or exhibit enantioselectivity for
neryl butyrate over geranyl butyrate. Such polypeptide can comprise one or more of: Ser at
position 17; Arg at position 22; Pro at position 31; Gly at position 33; Ser or Lys at position
41; Lys at position 141; Lys at position 146; Met at position 160; Arg at position 166; or His
at position 177, or can comprise one or more of: Ser at position 17; Pro at position 31; or
Arg at position 166.

In another aspect, the invention can comprise an isolated or recombinant
polypeptide comprising a sequence having at least 94% amino acid sequence identity to the
mature region of SEQ ID NO: 55, 61, 64, 65, 67, 68, 70, or 72. Alternatively, such
polypeptide can comprise a sequence having at least 94% amino acid sequence identity to the
mature region of SEQ ID NO: 55, e.g., the polypeptide can comprise a sequence selected
from SEQ ID NO: 55, 58-62, 75-78, 80-88, or 94-108 (or the mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 61, which polypeptide, e.g., can
comprise a sequence selected from SEQ ID NO: 55, 57-62, 75-78, 80-90, or 93-108.
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 64, which polypeptide, e.g., can
comprise a sequence selected from SEQ ID NO: 64, 71, or 72 (or the mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 65, which polypeptide, e.g., can
comprise a sequence selected from SEQ ID NO: 65, 66, or 73 (or a mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 67, which polypeptide, e.g., can
comprise the sequence SEQ ID NO: 67 (or the mature region thereof). Alternatively, the
polypeptide can comprise a sequence having at least 94% amino acid sequence identity to the
mature region of SEQ ID NO: 68, which polypeptide, e.g., can comprise a sequence selected
from SEQ ID NO: 68 or 101 (or the mature region thereof). Alternatively, the polypeptide
can comprise a sequence having at least 94% amino acid sequence identity to the mature
region of SEQ ID NO: 70, which polypeptide, e.g., can comprise a sequence selected from
SEQ ID NO: 63, 68-70, 82-83, 85-86, 96, or 101-102 (or the mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 72, which polypeptide, e.g., can
comprise a sequence selected from SEQ ID NO: 64, 71, or 72 (or a mature region thereof).

In another aspect, the invention can comprise an isolated or recombinant
polypeptide comprising a sequence having at least 85% amino acid sequence identity to the
mature region of SEQ ID NO: 74, which polypeptide, e.g., can comprise a sequence selected
from SEQ ID NO: 63, 71-72, 74, or 79 (or a mature region thereof).

In yet another aspect, the invention can comprise an isolated or recombinant 
polypeptide comprising a sequence having at least 99% amino acid sequence identity to the 
mature region of SEQ ID NO: 56. 
In other aspects, such isolated or recombinant polypeptide comprises an
amino acid sequence of any one of SEQ ID NO: 55 through SEQ ID NO: 108 over a 
comparison window of at least 45 contiguous amino acids. 
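
One way to picture the 45-residue comparison window is as a search for a shared contiguous stretch. The sketch below takes the strictest reading, an exact match over the window; the source may intend a looser comparison (e.g., substantial identity over the window), so exact matching and the function name `shares_contiguous_window` are assumptions made for illustration.

```python
def shares_contiguous_window(query: str, reference: str, window: int = 45) -> bool:
    """True if some run of `window` contiguous residues of `reference`
    occurs exactly (case-insensitively) in `query`.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    query = query.upper()
    reference = reference.upper()
    if len(query) < window or len(reference) < window:
        return False
    # Index every window-length stretch of the reference once.
    reference_windows = {
        reference[i:i + window] for i in range(len(reference) - window + 1)
    }
    # Slide the same window along the query and look for a shared stretch.
    return any(
        query[i:i + window] in reference_windows
        for i in range(len(query) - window + 1)
    )
```

The same helper can be called with larger windows (for example 50, 75, or 100) to mirror the longer contiguous stretches recited elsewhere in this summary.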

In some embodiments, the invention comprises an isolated or recombinant
polypeptide that is at least 45 contiguous amino acid residues of a polypeptide encoded by a
coding polynucleotide sequence wherein the polynucleotide sequence is selected from: a
polynucleotide sequence from any of SEQ ID NO: 1 to SEQ ID NO: 54; a polynucleotide
sequence that encodes a polypeptide selected from any of SEQ ID NO: 55 through SEQ ID
NO: 108; or a polynucleotide sequence that hybridizes under stringent conditions over
substantially the entire length of one of the above polynucleotide sequences or which
hybridizes to a subsequence comprising at least about 100 nucleic acids, provided that the
polynucleotide does not correspond to GenBank accession numbers: 1I6WA, 1I6WB,
A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108.

Additionally, the invention provides such isolated or recombinant polypeptide 
wherein the polypeptide exhibits enantioselectivity for either a cis form enantiomer or a trans
form enantiomer of a substrate and optionally wherein such enantioselectivity is represented 
by an enantiomeric ratio of at least 2 or more, at least 5 or more, at least 10 or more, at least 
50 or more, or at least 100 or more. 
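
The tiered ratio thresholds above (2, 5, 10, 50, 100) can be sketched as a simple classification. This passage does not define how the enantiomeric ratio is measured, so the example assumes, purely for illustration, that it is the quotient of two measured hydrolysis rates (preferred substrate over the other substrate); the source's own definition may differ. All names are hypothetical.

```python
from typing import Optional

# Thresholds recited in the text, in ascending order.
RATIO_THRESHOLDS = (2, 5, 10, 50, 100)


def selectivity_ratio(rate_preferred: float, rate_other: float) -> float:
    """Quotient of two measured rates (assumed definition, see lead-in)."""
    if rate_preferred < 0 or rate_other <= 0:
        raise ValueError("rates must be non-negative, and rate_other must be > 0")
    return rate_preferred / rate_other


def highest_threshold_met(ratio: float) -> Optional[int]:
    """Largest recited threshold that `ratio` reaches, or None if below 2."""
    met = [t for t in RATIO_THRESHOLDS if ratio >= t]
    return max(met) if met else None
```

Under this assumption, a polypeptide hydrolyzing one substrate twelve times faster than the other would fall in the "at least 10" tier.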

In one embodiment, the invention provides isolated or recombinant
polypeptides encoded by a nucleic acid selected from any of the following: a polynucleotide
sequence selected from SEQ ID NO: 1 to SEQ ID NO: 54 (or a complementary sequence
thereof); a polynucleotide sequence encoding a polypeptide selected from SEQ ID NO: 55 to
SEQ ID NO: 108 (or a complementary polynucleotide sequence thereof); a polynucleotide
sequence which hybridizes under highly stringent conditions over substantially the whole
length of any of the previously described polynucleotides, or which hybridizes to a
subsequence of the same comprising at least 100 residues wherein the polynucleotide
sequence does not comprise a sequence corresponding to any of GenBank accession
numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108; a polynucleotide sequence which comprises all, or a fragment
of, any of the above described polynucleotides and which encodes a polypeptide comprising
lipase activity; or a polynucleotide sequence encoding a polypeptide which comprises an
amino acid sequence that is substantially identical over at least 45 contiguous amino acid
residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108 wherein the polynucleotide
sequence does not comprise a sequence corresponding to any of GenBank accession
numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108. Additionally, such polynucleotide as is produced by mutating
or recombining one or more of the above described polynucleotide sequences is provided.
The invention also provides an isolated or recombinant polypeptide as described above which
comprises an amino acid sequence of any of SEQ ID NO: 55 to SEQ ID NO: 108.

WO 02/06457 PCT/US01/22160

In other aspects, the invention includes isolated or recombinant polypeptides
(as described above) which can optionally exhibit: lipase activity (e.g., with respect to
tributyrin, with respect to tributyrin in DMF, with respect to tributyrin after heat treatment
(i.e., after the polypeptide has been heat treated)); or enantioselective lipase activity (e.g., with
respect to neryl-butyrate or geranyl-butyrate). Optionally, such polypeptides can comprise
lipase activity against novel substrates (i.e., substrates upon which typical wild-type lipases
do not act) such as, e.g., methyl esters, pentadecanolide, or oxacyclotridecan. Additionally,
such polypeptides optionally are substantially identical over at least 45, at least 50, at least
75, at least 100, at least 125, at least 150, at least 175, or at least 200 contiguous amino acids
of any of the above described polypeptides with the proviso that the sequence does not
comprise a sequence corresponding to any of GenBank accession numbers: 1I6WA, 1I6WB,
A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108. Alternatively, such polypeptide is substantially identical over at least 180, at least
212, at least 213, or at least 215 contiguous amino acid residues of an above described
polypeptide, again with the proviso that the sequence does not comprise a sequence
corresponding to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108.

In various embodiments, the above described polypeptides further comprise
one or more of: a leader sequence, a precursor polypeptide, a secretion signal or a
localization signal, an epitope tag, a fusion protein comprising one or more additional amino
acid sequences, a polypeptide purification subsequence (e.g., an epitope tag, a FLAG tag, a
polyhistidine sequence, a GST fusion), an N-terminus methionine residue, or a modified
amino acid (e.g., a glycosylated amino acid, a PEGylated amino acid, a farnesylated amino
acid, an acetylated amino acid, a biotinylated amino acid, an amino acid conjugated to a lipid
moiety or to an organic derivatizing agent).

Other aspects of the invention include a composition of one or more modified
amino acid polypeptide and a pharmaceutically acceptable excipient and/or a composition
comprising one or more polypeptide of the invention with a surfactant (or with another
component of a cleaning solution such as a builder, a polymer, a bleach system, a structurant,
a pH adjuster, a humectant, or a neutral inorganic salt) or a pharmaceutically acceptable
excipient.

Additionally, a polypeptide which comprises a unique subsequence selected
from SEQ ID NO: 55 through SEQ ID NO: 108 which is unique as compared to a
polypeptide sequence corresponding to an amino acid sequence (or which is encoded by a
nucleic acid sequence) corresponding to any of GenBank accession numbers 1I6WA,
1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278,
AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356,
BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196,
CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340,
E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105,
and Z99108 is provided. Other aspects include a polypeptide which is specifically bound by
a polyclonal antisera raised against at least one antigen comprising at least one amino acid
sequence from SEQ ID NO: 55 to SEQ ID NO: 108 (or a fragment thereof) where the
antisera is subtracted with a polypeptide corresponding to an amino acid sequence (or which
is encoded by a nucleic acid sequence) corresponding to any of the above listed GenBank
accession numbers.

In other aspects the invention includes an antibody or antisera produced by
administering a polypeptide of the invention to a mammal and wherein the antibody or
antisera specifically binds at least one antigen which comprises a polypeptide sequence (or
fragment thereof) from SEQ ID NO: 55 to SEQ ID NO: 108 and which antibody or antisera 
does not specifically bind to a polypeptide encoded by a nucleic acid corresponding to, or an 
amino acid sequence corresponding to one or more of the above listed GenBank accession 
numbers. 

In yet other aspects, the invention includes an antibody or antisera that
specifically binds a polypeptide comprising an amino acid sequence (or fragment thereof)
from SEQ ID NO: 55 to SEQ ID NO: 108 and which antibody or antisera does not
specifically bind to a peptide encoded by a nucleic acid corresponding to, or an amino acid
sequence corresponding to, one or more of the above listed GenBank accession numbers.

The invention also includes a nucleic acid comprising a sequence selected
from: a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 54 (or a
complementary sequence thereof); a polynucleotide sequence encoding a polypeptide selected
from SEQ ID NO: 55 to SEQ ID NO: 108 (or a complementary sequence thereof); a
polynucleotide sequence which hybridizes under highly stringent conditions over
substantially the entire length of such sequences or which hybridizes to a subsequence
thereof of at least 100 residues, provided that the polynucleotide sequence does not
correspond to or encode any of GenBank accession numbers 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108; and a polynucleotide sequence comprising all or a fragment of any of the previous
polynucleotides and which comprises lipase activity and, again, which does not correspond to
or encode any of the above listed GenBank accession numbers.

Other embodiments of the invention can comprise a nucleic acid which
comprises a sequence which encodes a polypeptide having an amino acid sequence that is
substantially identical over at least 45, at least 50, at least 75, at least 100, at least 125, at
least 150, at least 175, or at least 200 contiguous amino acid residues of any of SEQ ID NO:
55 to SEQ ID NO: 108, and, again, which does not correspond to or encode any of the above
listed GenBank accession numbers. Additionally, the invention provides nucleic acid which
comprises a sequence encoding a polypeptide having a sequence that is substantially identical
over at least 180, at least 212, at least 213, or at least 215 contiguous amino acid residues of
any of SEQ ID NO: 55 to SEQ ID NO: 108, and which does not correspond to or encode any of
the above listed GenBank accession numbers.

Furthermore, the invention optionally provides such nucleic acids wherein the
encoded polypeptide can optionally exhibit: lipase activity (e.g., against tributyrin, against
tributyrin in DMF (dimethyl formamide), or against tributyrin after being heat treated (i.e.,
after the polypeptide has been heat treated)); or enantioselective lipase activity (e.g., against
neryl-butyrate and/or geranyl-butyrate). Optionally, such nucleic acids can encode
polypeptides which comprise lipase activity against novel substrates (i.e., substrates upon
which typical wild-type lipases do not act) such as, e.g., methyl esters, pentadecanolide, or
oxacyclotridecan. The invention also includes nucleic acids that comprise polynucleotide
sequences encoding polypeptides comprising lipase activity and which are produced by
mutating or recombining one or more polynucleotide sequence as described above (and
which optionally comprises lipase activity) and/or an enantioselective lipase activity, and
which do not correspond to or encode GenBank accession numbers: 1I6WA, 1I6WB,
A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108. The invention additionally provides any of the above described nucleic acids
wherein the encoded polypeptide comprises: a leader sequence; a precursor peptide; an
epitope tag sequence; or a fusion protein comprising one or more additional nucleic acid.

A composition comprising two or more nucleic acids of the invention, as well
as such compositions that comprise a library (e.g., of at least about 2, 5, 10, 50, or more
nucleic acids) is also a feature of the invention. Such compositions are optionally produced
by cleaving of one or more nucleic acid (e.g., by mechanical, chemical or enzymatic (e.g., a
restriction endonuclease, an RNAse, a DNAse, etc.) means) of any of the above described
nucleic acids. Compositions produced by incubating one or more of any of the above
described polynucleotides in the presence of deoxyribonucleotide triphosphates and a nucleic
acid polymerase (e.g., a thermostable polymerase) are also aspects of the current invention.
Additionally, the invention provides a cell comprising at least one nucleic acid as described
above (or a cleaved or amplified fragment or product thereof), which cell optionally
expresses a polypeptide encoded by the nucleic acid. Vectors and/or expression vectors (e.g.,
plasmids, cosmids, phages, viruses, virus fragments, etc.) comprising any nucleic acid of the
invention, as well as any cell transduced by such vectors are also provided. Compositions
comprising any nucleic acid of the invention and a surfactant (or with another component of
a cleaning solution such as a builder, a polymer, a bleach system, a structurant, a pH adjuster,
a humectant, or a neutral inorganic salt) and/or compositions comprising an excipient
(optionally a pharmaceutically acceptable excipient) are also provided in the invention.

In one aspect, the invention provides a nucleic acid which comprises a unique
subsequence selected from SEQ ID NO:1 to SEQ ID NO:54. The unique subsequence is
unique as compared to a nucleic acid corresponding to any of the sequences represented, e.g.,
by GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108, or related sequences present in
GenBank as of the filing of this application. Additionally, a target nucleic acid which
hybridizes under stringent conditions to a unique coding oligonucleotide which encodes a
unique subsequence in a polypeptide selected from SEQ ID NO: 55 to SEQ ID NO: 108,
wherein the unique subsequence is unique as compared to an amino acid sequence or to a
polypeptide encoded by a nucleic acid sequence corresponding to any of the above GenBank
accession numbers is also provided in the invention. Furthermore, in some embodiments the
stringent conditions are selected such that a perfectly complementary oligonucleotide to the
coding oligonucleotide hybridizes to the coding oligonucleotide with at least a 5x higher
signal to noise ratio than for hybridization of the perfectly complementary oligonucleotide to
a control nucleic acid corresponding to any of the above GenBank accession numbers and
wherein the target nucleic acid hybridizes to the unique coding oligonucleotide with at least
about a 2x higher signal to noise ratio as compared to hybridization of the control nucleic
acid to the coding oligonucleotide.
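
The two signal-to-noise criteria in the preceding paragraph reduce to simple fold comparisons. The sketch below assumes the four signal-to-noise values have already been measured and are supplied as plain numbers; how they are measured is not described in this passage, and the function and parameter names are hypothetical.

```python
def conditions_are_stringent(sn_complement_to_coding: float,
                             sn_complement_to_control: float,
                             min_fold: float = 5.0) -> bool:
    """Condition check: the perfectly complementary oligonucleotide gives at
    least `min_fold` (5x) higher signal-to-noise on the coding
    oligonucleotide than on the control nucleic acid.
    """
    return sn_complement_to_coding >= min_fold * sn_complement_to_control


def target_qualifies(sn_target_to_coding: float,
                     sn_control_to_coding: float,
                     min_fold: float = 2.0) -> bool:
    """Target check: the target nucleic acid gives at least about
    `min_fold` (2x) higher signal-to-noise than the control nucleic acid
    when each is hybridized to the coding oligonucleotide.
    """
    return sn_target_to_coding >= min_fold * sn_control_to_coding
```

In this reading, the first check validates the hybridization conditions and the second is then applied to each candidate target under those conditions.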

In some embodiments, the current invention provides a database of one or
more character strings corresponding to sequences selected from SEQ ID NO: 1 to SEQ ID
NO: 108. Such database optionally comprises one or more character string recorded in a
computer readable medium (e.g., internal or external to a computer). The invention also
provides: a method for manipulating a sequence record in a computer system by reading a
character string (optionally selected by a user, e.g., from a database or inputted by the user
into the computer system) corresponding to a sequence selected from SEQ ID NO: 1 to SEQ
ID NO: 108 (or a subsequence thereof); performing an operation on the character string; and
returning a result of the operation (optionally comprising transmitting the selected character
string to an output device). The operations performed in such computer system optionally
comprise any of the following: a local sequence comparison, a sequence alignment, a
sequence identity or similarity search, a structural similarity search, a sequence identity or
similarity determination, a structure determination, a nucleic acid motif determination, an
amino acid motif determination, a hypothetical translation, a determination of a restriction
map, a sequence recombination, or a BLAST determination. In some aspects the method can
comprise: aligning the selected character string with one or more additional character
strings corresponding to a polynucleotide or polypeptide sequence; translating one or more
character strings from SEQ ID NO: 1 to SEQ ID NO: 54 into a character string
corresponding to an amino acid sequence or translating a character string selected from SEQ
ID NO: 55 to SEQ ID NO: 108, into a character string corresponding to a polynucleotide
sequence; determining sequence identity or similarity between the selected character string
and one or more additional character strings by evaluating codon usage (optionally
determining optimal codon usage); and obtaining the result of the operation on a user output
device (e.g., optionally selected from a display monitor, a printer, and an audio output). The
method also comprises transmitting the character string to a device (e.g., an oligonucleotide
synthesizer or peptide synthesizer) capable of producing a physical embodiment of the
character string (e.g., a physical embodiment comprising a nucleic acid or polypeptide or
peptide corresponding to a character string or a sub-portion thereof).
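
The read, operate, and return pattern described above can be sketched with one of the listed operations, a hypothetical translation. The example reads a nucleotide character string, translates it in the first reading frame using the standard genetic code, and returns the resulting amino acid string. The choice of the standard code, the first frame, 'X' for unrecognized codons, and the function name `translate` are assumptions for illustration; the sequences of SEQ ID NO: 1 to 54 are not reproduced here.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA (or RNA) character string in reading frame 1.

    Whitespace is ignored, 'U' is treated as 'T', stop codons are written
    as '*', codons containing unrecognized characters are written as 'X',
    and any trailing partial codon is dropped.
    """
    cleaned = "".join(nucleotides.split()).upper().replace("U", "T")
    usable_length = len(cleaned) - len(cleaned) % 3
    protein = []
    for i in range(0, usable_length, 3):
        protein.append(CODON_TABLE.get(cleaned[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # Read a character string, perform the operation, return the result.
    record = "ATGAAAACC"
    print(translate(record))
```

The returned string could equally be passed to any of the other listed operations (for example an identity comparison) or sent to an output device, as the paragraph describes.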

In some embodiments the invention provides methods of producing modified
or recombinant nucleic acids comprising mutating or recombining (including through
recursive recombination) a nucleic acid of the invention (or a fragment thereof), as well as
the modified or recombinant nucleic acids that are produced by such method. Optionally, the
one or more additional nucleic acid encodes a polypeptide comprising lipase activity and/or
enantioselective lipase activity (or an amino acid subsequence or fragment thereof). The
recombination (e.g., recursive recombination) is optionally done in vitro or in vivo and
optionally produces at least one library of recombinant nucleic acids, which comprises at
least one polypeptide comprising lipase activity and/or enantioselective lipase activity (or a
homologue thereof). Both the nucleic acid library produced and a population of cells
comprising the library are provided by the invention, as are the modified or recombinant
nucleic acids produced by the mutation/recombination (and cells which comprise such
nucleic acids). In some aspects, the invention also provides a method of producing a
polypeptide by introducing a nucleic acid of the invention (or a fragment thereof), which is
operably linked to a regulatory sequence capable of directing expression of such nucleic acid,
into a population of cells and then expressing the polypeptide. The polypeptide produced
from such method is also part of the current invention. Such method optionally includes
isolating the polypeptide from the cells and optionally includes expressing the polypeptide by
culturing the population in a nutrient medium under conditions where the regulatory
sequence directs expression of the polypeptide (again, wherein the polypeptide is optionally
isolated or recovered from the cells and/or from the nutrient media (such culturing is
optionally done in a bulk fermentation vessel)). The cells used in such methods are
optionally bacterial, eukaryotic (e.g., fungal cells, yeast cells, plant cells, insect cells, or
mammalian cells (e.g., fertilized oocytes, embryonic stem cells, pluripotent stem cells, etc.)).
If mammalian cells are utilized, a transgenic animal is optionally regenerated from the cells
and the polypeptide is optionally recovered from the transgenic animal or from a by-product
of the transgenic animal such as milk.

In other aspects, the current invention provides methods/compositions for a 
cleaning solution (e.g., detergent) comprising the lipase homologues. Additional components 
(e.g., surfactants, proteolytic enzymes, humectants, neutral inorganic salts, sudsing agent, 
fragrance, structurants, etc.) can be included, individually or in combination, in such compositions.

In yet other aspects, the current invention provides methods to therapeutically
or prophylactically treat a gastrointestinal lipid related condition/disease/disorder by
hydrolyzing a lipid through expressing a polypeptide in a target cell or contacting a target
cell with an effective amount of polypeptide of the invention (or a fragment thereof). Such
target cell optionally is in culture or is within a subject to be treated. The current invention
also provides a method of therapeutic or prophylactic treatment of a gastrointestinal lipid
related condition/disease/disorder in a subject wherein the subject is administered a
polypeptide of the invention in an amount effective to treat the condition/disease/disorder,
including wherein the subject is a mammal (e.g., a human), and wherein the polypeptide is
administered in vivo, in vitro, or ex vivo (or a combination of such) to one or more cells of
the subject. Such polypeptides include compositions of the polypeptide and a
pharmaceutically acceptable excipient, which is administered to a subject in an amount
effective to treat a gastrointestinal lipid related condition/disease/disorder (e.g., cystic
fibrosis, celiac disease, Crohn's disease, indigestion, and obesity).

Another provision of the invention is a method of hydrolyzing a lipid to
therapeutically or prophylactically treat a gastrointestinal lipid related
condition/disease/disorder by introducing into a target cell a nucleic acid of the invention, or
a fragment thereof, which is operably linked to a regulatory sequence active in the target cell
such that introduction of the polynucleotide results in expression of the nucleic acid in an
amount sufficient to hydrolyze the lipid. Such method optionally comprises directly
administering the nucleic acid to a subject in an amount sufficient to introduce the nucleic
acid into one or more cells. The subject optionally comprises a mammal (or a human) and
the nucleic acid optionally comprises a vector. Yet another provision of the invention is a
method of therapeutically or prophylactically treating a gastrointestinal lipid related
condition/disease/disorder by expressing in a target cell (or contacting a target cell with an
effective amount of) a polynucleotide of the invention, or a fragment thereof, or of a
polypeptide encoded thereby (or a fragment thereof). Such method can include wherein the
target cell is in culture or wherein the target cell is within a subject. Additionally, the
invention provides a method of therapeutically or prophylactically treating a gastrointestinal
lipid related condition/disease/disorder in a subject by administering to the subject a
polynucleotide of the invention (or a fragment thereof) or a polypeptide encoded thereby (or
a fragment thereof) in an amount effective to treat the gastrointestinal lipid related
condition/disease/disorder. Such method comprises optional embodiments wherein the
subject is a mammal or a human and wherein the polynucleotide and/or polypeptide is
administered in vivo, in vitro, or ex vivo (or a combination of such) to one or more cells of
the subject and wherein a composition of the polynucleotide and/or polypeptide and a
pharmaceutically acceptable excipient is administered to the subject in an amount effective to
treat the gastrointestinal lipid related condition/disease/disorder (e.g., cystic fibrosis, celiac
disease, Crohn's disease, indigestion, or obesity).

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1: Enantiomer Selectivity of Exemplary Lipase Homologues.

Figure 2: Enantiomeric Ratio for Exemplary Lipase Homologues.

Figure 3a-3i: Alignment of Exemplary Novel Lipase Polynucleotides (SEQ
ID NO: 1-20).

Figure 4a-4h: Alignment of Exemplary Novel Lipase Polynucleotides
(SEQ ID NO: 21-54).

Figure 5a-5c: Alignment of Exemplary Novel Lipase Polypeptides (SEQ ID
NO: 55-74).

Figure 6a-6c: Alignment of Exemplary Novel Lipase Polypeptides (SEQ
ID NO: 75-108).

DETAILED DESCRIPTION OF THE INVENTION

DEFINITIONS

Unless otherwise defined herein or below in the remainder of the 
specification, all technical and scientific terms used herein have the same meaning as 
commonly understood by those of ordinary skill in the art to which the present invention 
belongs.

A "polynucleotide sequence" is a nucleic acid (which is a polymer of 
nucleotides (A, C, T, U, G, etc. or naturally occurring nucleotide analogues, artificial nucleotide
analogues, etc.)) or a character string representing a nucleic acid, depending on context.
Either the given nucleic acid or the complementary nucleic acid can be determined from any
specified polynucleotide sequence.
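
As an illustration of the last point, the complementary strand can be derived mechanically from a character string representing a DNA sequence. The sketch below is only a minimal example under stated assumptions: the function name `reverse_complement` is hypothetical, the input is assumed to contain only the unmodified DNA bases A, C, G, and T, and nucleotide analogues or RNA (U) are not handled.

```python
# Minimal sketch (hypothetical helper, not part of the disclosure):
# derive the complementary strand, written 5' to 3', from a DNA string.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence of A/C/G/T characters."""
    # Complement every base, then reverse so the result reads 5' to 3'.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Characters outside the assumed alphabet pass through unchanged, so a caller working with analogues would need to extend the table.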

Similarly, an "amino acid sequence" is a polymer of amino acids (a protein,
polypeptide, etc.) or a character string representing an amino acid polymer, depending on
context.

A "subsequence" or "fragment" is any portion of an entire sequence, up to and
including the complete sequence.

"Substantially an entire length of a polynucleotide or amino acid sequence" 
refers to at least about 50%, at least about 60%, generally at least about 70%, generally at
least about 80%, or typically at least about 90%, 95%, 96%, 97%, 98%, or 99% or more of a
length of an amino acid sequence or nucleic acid sequence.

Numbering of a given amino acid or nucleotide polymer "corresponds to
numbering" of a selected amino acid polymer or nucleic acid when the position of any given
polymer component (amino acid residue, incorporated nucleotide, etc.) is designated by
reference to the same residue position in the selected amino acid or nucleotide, rather than by
the actual position of the component in the given polymer.

"Naturally occurring," as applied to an object, refers to the fact that the object 
can be found in nature. For example, a polypeptide or polynucleotide sequence that is
present in an organism, including viruses, that can be isolated from a source in nature and
which has not been intentionally modified by humankind in the laboratory is naturally
occurring. In one aspect, a "naturally occurring" nucleic acid (e.g., DNA or RNA) molecule
is a nucleic acid molecule that exists in the same state as it exists in nature; that is, the nucleic
acid molecule is not isolated, recombinant, or cloned.

A nucleic acid, protein, peptide, polypeptide, or other component is "isolated"
when it is partially or completely separated from components with which it is normally
associated (such as other peptides, polypeptides, proteins (including complexes, e.g.,
polymerases and ribosomes which may accompany a native sequence), nucleic acids, cells,
synthetic reagents, cellular contaminants, cellular components, etc.), e.g., from other
components with which it is normally associated in the cell from which it was originally
derived. A nucleic acid, polypeptide, or other component is substantially pure when it is
partially or completely recovered or separated from other components of its natural
environment such that it is the predominant species present in a composition, mixture, or
collection of components (i.e., on a molar basis it is more abundant than any other individual
species in the composition). In preferred embodiments, the preparation consists of more than
70%, typically more than 80%, or preferably more than 90% of the isolated species.

In one aspect, a "substantially pure" or "isolated" nucleic acid (e.g., RNA or
DNA), polypeptide, protein, or composition also means wherein the object species (e.g.,
nucleic acid or polypeptide) comprises at least about 50, 60, or 70 percent by weight (on a
molar basis) of all macromolecular species present. A substantially pure or isolated
composition can also comprise at least about 80, 90, 95, 96, 97, 98, or 99 or more percent by
weight of all macromolecular species present in the composition. An isolated object species
can also be purified to essential homogeneity (contaminant species cannot be detected in the
composition by conventional detection methods) wherein the composition consists
essentially of derivatives of a single macromolecular species.

The term "isolated nucleic acid" can also refer to a nucleic acid (e.g., DNA or
RNA) that is not immediately contiguous with both of the coding sequences with which it is
immediately contiguous (i.e., one at the 5' and one at the 3' end) in the naturally occurring
genome of the organism from which the nucleic acid of the invention is derived. Thus, this
term includes, e.g., a cDNA or a genomic DNA fragment produced by polymerase chain
reaction (PCR) or restriction endonuclease treatment, whether such cDNA or genomic DNA
fragment is incorporated into a vector, integrated into the genome of the same or a different
species than the organism, including, e.g., a virus, from which it was originally derived,
linked to an additional coding sequence to form a hybrid gene encoding a chimeric
polypeptide, or independent of any other DNA sequences. The DNA may be
double-stranded or single-stranded, sense or antisense.

A nucleic acid or polypeptide is "recombinant" when it is artificial or
engineered, or derived from an artificial or engineered protein or nucleic acid. The term
"recombinant" when used with reference e.g., to a cell, nucleotide, vector, or polypeptide
typically indicates that the cell, nucleotide, or vector has been modified by the introduction of
a heterologous (or foreign) nucleic acid or the alteration of a native nucleic acid, or that the
polypeptide has been modified by the introduction of a heterologous amino acid, or that the
cell is derived from a cell so modified. Recombinant cells express nucleic acid sequences
(e.g., genes) that are not found in the native (non-recombinant) form of the cell or express
native nucleic acid sequences (e.g., genes) that would be abnormally expressed,
under-expressed, or not expressed at all.

The term "recombinant nucleic acid" (e.g., DNA or RNA) molecule means,
for example, a nucleotide sequence that is not naturally occurring or is made by the
combination (for example, artificial combination) of at least two segments of sequence that
are not typically included together, not typically associated with one another, or are
otherwise typically separated from one another. A recombinant nucleic acid can comprise a
nucleic acid molecule formed by the joining together or combination of nucleic acid
segments from different sources and/or artificially synthesized. The term "recombinantly
produced" refers to an artificial combination usually accomplished by either chemical
synthesis means, recursive sequence recombination of nucleic acid segments or other
diversity generation methods of nucleotides, or manipulation of isolated segments of nucleic
acids, e.g., by genetic engineering techniques known to those of ordinary skill in the art.
"Recombinantly expressed" typically refers to techniques for the production of a
recombinant nucleic acid in vitro and transfer of the recombinant nucleic acid into cells in
vivo, in vitro, or ex vivo where it may be expressed or propagated. A "recombinant
polypeptide" or "recombinant protein" usually refers to polypeptide or protein, respectively,
that results from a cloned or recombinant gene or nucleic acid.

A "vector" is a composition for facilitating cell transduction by a selected
nucleic acid, or expression of the nucleic acid in the cell. Vectors include, e.g., plasmids,
cosmids, viruses, YACs, bacteria, poly-lysine, etc. An "expression vector" is a nucleic acid
construct, generated recombinantly or synthetically, with a series of specific nucleic acid
elements that permit transcription of a particular nucleic acid in a host cell. The expression
vector can be part of a plasmid, virus, or nucleic acid fragment. The expression vector
typically includes a nucleic acid to be transcribed operably linked to a promoter.

The term "homology" generally refers to the degree of similarity between two
or more structures. The term "homologous sequences" refers to regions in macromolecules
that have a similar order of monomers. When used in relation to nucleic acid sequences, the
term "homology" refers to the degree of similarity between two or more nucleic acid
sequences (e.g., genes) or fragments thereof. Typically, the degree of similarity between two
or more nucleic acid sequences refers to the degree of similarity of the composition, order, or
arrangement of two or more nucleotide bases (or other genotypic feature) of the two or more
nucleic acid sequences. The term "homologous nucleic acids" generally refers to nucleic
acids comprising nucleotide sequences having a degree of similarity in nucleotide base
composition, arrangement, or order. The two or more nucleic acids may be of the same or
different species or group. The term "percent homology" when used in relation to nucleic
acid sequences, refers generally to a percent degree of similarity between the nucleotide
sequences of two or more nucleic acids.
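
The text does not fix a formula for the "percent degree of similarity." One common convention, shown here purely as an assumption, is percent identity over the columns of a pairwise alignment in which neither sequence has a gap. The function name `percent_identity` and the use of '-' as the gap character are hypothetical choices for this sketch; the two input strings are assumed to have been aligned beforehand and to be of equal length.

```python
# Minimal sketch, assuming "percent homology" is approximated by percent
# identity over ungapped alignment columns (an assumption, not the
# definition given in the specification).
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns whose residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT-A", "ACCTTA"))
```

Other conventions (for example, counting gapped columns in the denominator, or scoring conservative substitutions as similar) give different values for the same alignment.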

When used in relation to polypeptide (or protein) sequences, the term
"homology" refers to the degree of similarity between two or more polypeptide (or protein)
sequences (e.g., genes) or fragments thereof. Typically, the degree of similarity between two
or more polypeptide (or protein) sequences refers to the degree of similarity of the
composition, order, or arrangement of two or more amino acids of the two or more
polypeptides (or proteins). The two or more polypeptides (or proteins) may be of the same or
different species or group. The term "percent homology" when used in relation to
polypeptide (or protein) sequences, refers generally to a percent degree of similarity between
the amino acid sequences of two or more polypeptide (or protein) sequences. The term
"homologous polypeptides" or "homologous proteins" generally refers to polypeptides or
proteins, respectively, that have amino acid sequences and functions that are similar. Such
homologous polypeptides or proteins may be related by having amino acid sequences and
functions that are similar, but are derived from, or evolved from, different or the same
species using the techniques described herein.

The term "subject" as used herein includes, but is not limited to, an organism;
a mammal, including, e.g., human, non-human primate (e.g., monkey), mouse, pig, cow, goat,
rabbit, rat, guinea pig, hamster, horse, monkey, sheep, or other non-human mammal; a
non-mammal, including, e.g., a non-mammalian vertebrate, such as a bird (e.g., chicken or duck)
or a fish; and a non-mammalian invertebrate.

The term "pharmaceutical composition" means a composition suitable for
pharmaceutical use in a subject, including an animal or human. A pharmaceutical
composition generally comprises an effective amount of an active agent and a
pharmaceutically acceptable carrier.

The term "effective amount" means a dosage or amount sufficient to produce
a desired result. The desired result may comprise an objective or subjective improvement in
the recipient which receives the dosage or amount.

A "prophylactic treatment" is a treatment administered to a subject who does 
not display signs or symptoms of a disease, pathology, or medical disorder, or displays only
early signs or symptoms of a disease, pathology, or disorder, such that treatment is
administered for the purpose of diminishing, preventing, or decreasing the risk of developing
the disease, pathology, or medical disorder. A prophylactic treatment functions as a
preventative treatment against a disease or disorder. A "prophylactic activity" is an activity
of an agent, such as a nucleic acid, vector, gene, polypeptide, protein, substance, or
composition thereof that, when administered to a subject who does not display signs or
symptoms of pathology, disease or disorder, or who displays only early signs or symptoms of
pathology, disease, or disorder, diminishes, prevents, or decreases the risk of the subject
developing a pathology, disease, or disorder. A "prophylactically useful" agent or compound
(e.g., nucleic acid or polypeptide) refers to an agent or compound that is useful in
diminishing, preventing, treating, or decreasing development of pathology, disease or
disorder.

A "therapeutic treatment" is a treatment administered to a subject who 
displays symptoms or signs of pathology, disease, or disorder, in which treatment is
administered to such subject for the purpose of diminishing or eliminating those signs or
symptoms of pathology, disease, or disorder. A "therapeutic activity" is an activity of an
agent, such as a nucleic acid, vector, gene, polypeptide, protein, substance, or composition
thereof, that eliminates or diminishes signs or symptoms of pathology, disease or disorder,
when administered to a subject suffering from such signs or symptoms. A "therapeutically
useful" agent or compound (e.g., nucleic acid or polypeptide) indicates that an agent or
compound is useful in diminishing, treating, or eliminating such signs or symptoms of a
pathology, disease or disorder.

The term "gene" broadly refers to any segment of DNA associated with a 
biological function. Genes include coding sequences and/or regulatory sequences required
for their expression. Genes also include non-expressed DNA nucleic acid segments that, e.g.,
form recognition sequences for other proteins (e.g., promoter, enhancer, or other regulatory
regions).

Generally, the nomenclature used herein, and the laboratory procedures in cell
culture, molecular genetics, molecular biology, nucleic acid chemistry, and protein chemistry
described below, are those well known and commonly employed by those of ordinary skill in
the art. Standard techniques, such as described in Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989 (hereinafter "Sambrook") and Current Protocols in Molecular
Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene
Publishing Associates, Inc. and John Wiley & Sons, Inc. (supplemented through 2000)
(hereinafter "Ausubel"), are used for recombinant nucleic acid methods, nucleic acid
synthesis, cell culture methods, and transgene incorporation, e.g., electroporation, injection,
and lipofection. Generally, oligonucleotide synthesis and purification steps are performed
according to specifications. The techniques and procedures are generally performed
according to conventional methods in the art and various general references which are
provided throughout this document. The procedures herein are believed to be well known to
those of ordinary skill in the art and are provided for the convenience of the reader.

As used herein, an "antibody" refers to a protein comprising one or more
polypeptides substantially or partially encoded by immunoglobulin genes or fragments of
immunoglobulin genes. The recognized immunoglobulin genes include the kappa, lambda,
alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad
immunoglobulin variable region genes. Light chains are classified as either kappa or lambda.
Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the
immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively. A typical
immunoglobulin (e.g., antibody) structural unit comprises a tetramer. Each tetramer is
composed of two identical pairs of polypeptide chains, each pair having one "light" chain
(about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each chain
defines a variable region of about 100 to 110 or more amino acids primarily responsible for
antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer
to these light and heavy chains, respectively. Antibodies exist as intact immunoglobulins or
as a number of well characterized fragments produced by digestion with various peptidases.
Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge
region to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a
disulfide bond. The F(ab)'2 may be reduced under mild conditions to break the disulfide
linkage in the hinge region thereby converting the (Fab')2 dimer into an Fab' monomer. The
Fab' monomer is essentially an Fab with part of the hinge region (see, Fundamental
Immunology, W.E. Paul, ed., Raven Press, N.Y. (1993), for a more detailed description of
other antibody fragments). While various antibody fragments are defined in terms of the
digestion of an intact antibody, one of skill will appreciate that such Fab' fragments may be
synthesized de novo either chemically or by utilizing recombinant DNA methodology. Thus,
the term antibody, as used herein also includes antibody fragments either produced by the
modification of whole antibodies or synthesized de novo using recombinant DNA
methodologies. Antibodies include single chain antibodies, including single chain Fv (sFv)
antibodies in which a variable heavy and a variable light chain are joined together (directly or
through a peptide linker) to form a continuous polypeptide.

The term "lipase activity" refers herein to the ability of a lipase enzyme to
hydrolyze a lipid, oil, or fat molecule, detected by, for example, any of the lipase activity
assays described herein or known to those having ordinary skill in the art (see, e.g.,
EXAMPLE I and the references cited therein).

"Enantioselective lipase activity" refers herein to the ability of a lipase
enzyme to preferentially hydrolyze a specific enantiomer of a lipid, oil, or fat molecule,
detected by, for example, any of the enantioselective lipase activity assays described herein
(see, e.g., EXAMPLE II and the references cited therein).

A "mature region" as used herein refers to the mature coding region of a
polypeptide, i.e., it does not include the signal peptide coding region. For example, Figures 3
and 5 depict the mature coding regions of exemplary lipases of the current invention.

An "equivalent amino acid position" is defined herein as an amino acid 
position of a test polypeptide which aligns with an amino acid position of SEQ ID NO:75
using an alignment algorithm as described herein. The equivalent amino acid position of the
test polypeptide need not be the same as the linear amino acid sequence position of the test
polypeptide. As an example, amino acid number 2 of the polypeptide SEQ ID NO:75 is
considered to be the equivalent amino acid position to amino acid number 35 of the
polypeptide SEQ ID NO:55 and to amino acid number 38 of SEQ ID NO:65, since amino
acid number 2 of SEQ ID NO:75 aligns with amino acid number 35 of SEQ ID NO:55 and
with amino acid number 38 of SEQ ID NO:65 using an alignment algorithm described
herein, e.g., the CLUSTALW alignment program using default parameters. Therefore,
"amino acid position 2 or an equivalent position to that of SEQ ID NO:75" is meant to
correspond, e.g., to amino acid 35 of SEQ ID NO:55, amino acid 38 of SEQ ID NO:65, etc.
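
The position mapping described above can be sketched in code once an alignment is available. The example below is a minimal illustration under stated assumptions: the alignment is assumed to have been computed already (e.g., by CLUSTALW) and is given as two equal-length strings with '-' marking gaps; the function name `equivalent_position` is hypothetical; and the short sequences in the usage lines are invented toy strings, not the sequences of the sequence listing.

```python
# Minimal sketch (hypothetical helper): map a 1-based residue position in an
# aligned reference sequence to the aligned 1-based position in a test sequence.
from typing import Optional


def equivalent_position(aligned_ref: str, aligned_test: str,
                        ref_pos: int, gap: str = "-") -> Optional[int]:
    """Return the test-sequence position aligned with ref_pos, or None if
    the reference residue aligns with a gap in the test sequence."""
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0   # residues of the reference seen so far
    test_count = 0  # residues of the test sequence seen so far
    for r, t in zip(aligned_ref, aligned_test):
        if r != gap:
            ref_count += 1
        if t != gap:
            test_count += 1
        if r != gap and ref_count == ref_pos:
            return test_count if t != gap else None
    raise ValueError("ref_pos is beyond the end of the reference sequence")


if __name__ == "__main__":
    # Toy alignment: the test sequence carries three extra leading residues,
    # so reference position 2 corresponds to test position 5.
    print(equivalent_position("---MKL", "AAGMRL", 2))
```

The mapping depends entirely on the alignment supplied, so different alignment programs or parameters can yield different equivalent positions for the same pair of sequences.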

A variety of additional terms are defined or otherwise characterized herein. 

POLYNUCLEOTIDES

Novel Lipase Sequences

The invention provides isolated or recombinant lipase polypeptides and 
homologues thereof (optionally collectively referred to as lipase polypeptides), and isolated 
or recombinant polynucleotides encoding the polypeptides. 

Novel Lipase Molecules and Lipase Variants

The present invention relates to the isolation of newly discovered lipase
polynucleotides from different strains of Bacillus as well as creation of novel lipase
polynucleotides. A number of Bacillus species (both known Bacillus species and un-typed
Bacillus species) were screened to identify lipase activity while in colonies. Plate screens
were used to identify those colonies expressing lipase activity. See, "EXAMPLE I" below,
and, e.g., Dartois, V. et al., "Cloning, nucleotide sequence and expression in Escherichia coli
of a lipase gene from Bacillus subtilis 168," Biochimica et Biophysica Acta 1131 (1992)
253-260 and references cited therein.

DNA from colonies which displayed lipase activity was used in PCR reactions
with degenerate lipase primers designed to a known lipase gene from Bacillus subtilis. For
reactions that did not readily produce amplified lipase genes, the DNA isolates were
amplified using internal degenerate primers designed to anneal to more conserved regions,
thus producing lipase gene fragments which were spliced into B. subtilis to generate chimeric
full-length genes. The techniques used for amplification, etc. are well known to those of skill
in the art and references teaching such are replete herein. The lipase genes discovered
through this process (SEQ ID NO: 1 through SEQ ID NO: 20) correspond to lipase
homologue polypeptides shown in SEQ ID NO: 55 through SEQ ID NO: 74. Novel lipase
polynucleotides were isolated from cultures of B. pumilus, B. subtilis, B. megaterium, B.
lentus, B. circulans, B. azotoformans, B. firmus, and B. badius (see, SEQ ID NO: 1 through
SEQ ID NO: 8 and SEQ ID NO: 55 through SEQ ID NO: 62) as well as from undetermined
Bacillus species (see, SEQ ID NO: 9 through SEQ ID NO: 20 and SEQ ID NO: 63 through
SEQ ID NO: 74). See, Figures 3 and 5.

The newly isolated Bacillus lipase polynucleotides were then recombined to
create libraries of novel lipase homologues which were screened for lipase activity and
enantioselectivity (see, "EXAMPLE I" and infra). A number of homologues were chosen for
further analysis (i.e., the novel lipase homologues of the invention). Methods and protocols
for generation of nucleic acid libraries and of nucleic acid recombination are well known to
those of skill in the art and can be found in numerous references cited herein. The nucleic
acids for both the discovered Bacillus lipases and the newly created lipases were cloned into
E. coli expression vectors, transformed into E. coli, and screened for lipase activity (see,
below for screening).

Sequences of the newly created lipase polynucleotides (i.e., those created
through recombination of the newly isolated lipase genes) are shown in SEQ ID NO: 21
through SEQ ID NO: 54 (with the corresponding amino acid sequences being SEQ ID NO:
75 through SEQ ID NO: 108). It should be noted that the nucleic acid sequences of the
created lipase homologues (SEQ ID NO: 21 through SEQ ID NO: 54) are present in the
sequence listing table herein with an introductory 5' 'T' and an ending 3' 'TGA,' used for,
e.g., construction of vector attachment sites, etc. and which, in many embodiments of the
invention, are optionally removed or are not present. See, Figures 4 and 6.
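
Optional removal of the flanking nucleotides from a listed sequence can be sketched as a simple string operation. This is only an illustrative sketch: the function name `strip_flanking` is hypothetical, the default leader and trailer follow the 5' 'T' and 3' 'TGA' noted above, and the short sequence in the usage line is an invented toy string rather than an entry from the sequence listing.

```python
# Minimal sketch (hypothetical helper): remove an optional 5' leader and
# 3' trailer from a nucleotide sequence given as a character string.
def strip_flanking(seq: str, leader: str = "T", trailer: str = "TGA") -> str:
    """Return seq in upper case without the leading/trailing flank, if present."""
    s = seq.upper()
    if leader and s.startswith(leader):
        s = s[len(leader):]
    if trailer and s.endswith(trailer):
        s = s[:-len(trailer)]
    return s


if __name__ == "__main__":
    print(strip_flanking("TATGGCTTGA"))
```

A sequence lacking either flank is returned otherwise unchanged, which matches the case in which the flanking nucleotides are not present.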

The newly created lipase homologues of the invention (i.e., SEQ ID NO: 21
through SEQ ID NO: 54 and SEQ ID NO: 75 through SEQ ID NO: 108) were also examined
for enantioselectivity. Enantioselectivity as used herein, refers to the preference of an
enzyme (e.g., lipase) to selectively utilize one substrate enantiomer over another enantiomer.
Enantiomers are stereoisomers that are non-superimposable mirror images of each other. For
example, neryl-butyrate and geranyl-butyrate are enantiomers of one another. It will be
appreciated that while the screen was for enantioselectivity against neryl or geranyl butyrate,
the novel lipase homologues herein optionally show lipase activity and/or enantioselective
lipase activity against other substrates, e.g., neryl or geranyl acetate, other cis/trans lipids or
lipid esters, etc.

While enantiomers have the same basic structure, they can vary in some
specifics. For example, the cis/trans enantiomers neryl-butyrate and geranyl-butyrate are
used for different processes in the perfume/fragrance industry. Thus, enzymatic pathways
that specifically produce one or the other (i.e., either neryl or geranyl butyrate) would be a
welcome addition. Of course, myriad other enantiomers (both known and unknown) are also
useful in numerous processes/applications and neryl/geranyl butyrate is only a non-limiting
example of possible enantiomeric substrates for the lipase homologues of the invention.

The present invention also provides enantioselective lipases.
Enantioselectivity can be readily determined as described below by comparing the
conversion of such substrate enantiomers. For example, enantioselectivity was detected by
growing clones expressing lipases of the present invention on media containing
neryl-butyrate and geranyl-butyrate. The neryl-butyrate and geranyl-butyrate created a hazy
appearance in the media on which the library constituents were grown. If an individual
colony of a library produced active lipase (either secreted lipase or lipase from lysed cells)
that utilized the neryl and/or geranyl butyrate in the media, it would break it down and clear
that area of the plate. In other words, the colonies containing active lipase (which could
break down the neryl-butyrate and/or geranyl-butyrate) produced a clear ring or halo around
the colony. Such colonies were isolated and further analyzed to check for enantioselectivity.
The protocol followed corresponded to that found in "SCREENING FOR ENZYME
STEREOSELECTIVITY UTILIZING MASS SPECTROMETRY," by Davis et al., Attorney
Docket Number 02-109010US, USSN 60/278934 filed March 26, 2001. While all the
sequences used to create the libraries (i.e., SEQ ID NO: 1-20 (nucleic acid) and SEQ ID
NO: 55-74 (polypeptide)) displayed enantioselectivity for geranyl-butyrate, a number of the
novel lipase homologues of the invention surprisingly display enantioselectivity for
neryl-butyrate while other lipase homologue polypeptides displayed greater geranyl
enantioselective lipase activity than the parental clones. See, Figures 1 and 2 which list the
enantioselectivity (i.e., either for geranyl butyrate or neryl butyrate) and selected
enantiomeric ratio values for selected lipase homologues.

As described in USSN 60/278934, the phrase "enzyme stereoselectivity" 
refers to the preference for one substrate stereoisomer or pseudo-stereoisomer (if one form is
labeled) over another or others in a chemical reaction catalyzed by an enzyme. When the
stereoisomers are enantiomers, the phenomenon is referred to as "enzyme enantioselectivity"
and is quantitatively expressed by the enantiomeric excess or the enantiomeric ratio.
"Enantiomeric excess" refers to the absolute difference between the mole or weight fractions
of major (F(+)) and minor (F(-)) enantiomers (i.e., |F(+) - F(-)|), where F(+) + F(-) = 1. The
percent enantiomeric excess is 100 |F(+) - F(-)|. The enantiomeric ratio is determined by the
following equation: 

$$E = \frac{\ln[1 - c(1 + DE(p))]}{\ln[1 - c(1 - DE(p))]}$$

where c = the percent total substrate conversion (expressed as a decimal), and DE(p) is the
diastereomeric excess (i.e., the percent product of isomer "1" less the percent product of 
isomer "2"). 
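
The two quantities defined above can be computed directly. The following is a minimal sketch, not the assay software used in the screen; the function names and the input checks are assumptions made for illustration, and DE(p) is taken as a decimal fraction in the same way that c is.

```python
import math


def enantiomeric_excess(f_major, f_minor):
    """Return |F(+) - F(-)| for mole or weight fractions that sum to 1."""
    if not math.isclose(f_major + f_minor, 1.0):
        raise ValueError("fractions must sum to 1")
    return abs(f_major - f_minor)


def enantiomeric_ratio(c, de_p):
    """Return E = ln[1 - c(1 + DE(p))] / ln[1 - c(1 - DE(p))].

    c    -- total substrate conversion as a decimal (0 < c < 1)
    de_p -- product excess as a decimal (0 <= de_p < 1); decimal form is an
            assumption of this sketch
    """
    if not (0.0 < c < 1.0) or not (0.0 <= de_p < 1.0):
        raise ValueError("c must be in (0, 1) and de_p in [0, 1)")
    numerator_arg = 1.0 - c * (1.0 + de_p)
    denominator_arg = 1.0 - c * (1.0 - de_p)
    if numerator_arg <= 0.0:
        raise ValueError("c * (1 + de_p) must be less than 1")
    return math.log(numerator_arg) / math.log(denominator_arg)
```

With no product excess (de_p = 0) the two logarithms are equal and E is 1, i.e., no selectivity; E rises as the product excess increases at a fixed conversion.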

Employing the methods described herein and in USSN 60/278934, it was 
determined that polypeptide sequences SEQ ID NOS: 55 to 74 displayed enantioselectivity 
for geranyl butyrate versus neryl butyrate. As an example, the E (enantiomeric ratio) value
of an exemplary newly discovered lipase homologue for the geranyl enantiomer is about 2.
See Figure 2.

A number of novel lipase homologues of the invention displayed 
enantioselectivity for geranyl butyrate versus neryl butyrate greater than that of the parental 
sequences. For example, two exemplary homologues having a preference for the geranyl
enantiomer have E values of at least about 3 or more. 

Surprisingly, none of SEQ ID NO: 1-20 (SEQ ID NO: 55-74 for the corresponding
polypeptides) displayed enantioselectivity for neryl butyrate, yet a number of the other
lipases of the present invention did display enantioselectivity for neryl butyrate versus
geranyl butyrate, with E values for the neryl enantiomer of at least about 1.4 up to about 2.2
for selected homologues. See Figure 2.

Novel Substitutions

Certain lipase homologues of the invention (e.g., SEQ ID NOS: 75 to 108) 
contain one or more of the following amino acid substitutions: Lys at position 1, Thr at
position 14, Ser at position 17, Arg at position 22, Glu at position 26, Pro at position 31, Gly
at position 33, Glu at position 34, Pro at position 35, Pro or Thr at position 37, Ser or Lys at
position 41, Gly at position 42, Arg or Glu at position 43, Ala at position 61, Tyr at position
75, Gly at position 96, Ser at position 97, Thr at position 104, Ser at position 107, Ala at
position 125, Gly at position 129, Val at position 134, Cys at position 138, Lys at position
141, Lys at position 146, Thr at position 156, Met at position 160, Arg at position 166, or His
at position 177, which are not found in equivalent amino acid positions of related lipase
sequences having GenBank Protein Accession Nos. AAA22574, CAB95850, CAB12664,
BAA11406, CAA02196, CAA00273, CAB12064, BAA22231, and CAA00274. An
equivalent amino acid position is defined supra as an amino acid position of a test
polypeptide which aligns with an amino acid position of SEQ ID NO:75 (see, supra).
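
As an illustration of how the substitution list above might be applied to a test polypeptide, the sketch below reports which of the listed residues occur at equivalent positions. It assumes the test sequence has already been aligned to SEQ ID NO:75, so that character i of the string (counting from 1) corresponds to position i, with "-" marking a gap; the alignment step itself, the function name, and the toy sequence in the test are hypothetical and not taken from the specification.

```python
# Position -> one-letter residue(s) listed in the paragraph above.
NOVEL_SUBSTITUTIONS = {
    1: "K", 14: "T", 17: "S", 22: "R", 26: "E", 31: "P", 33: "G", 34: "E",
    35: "P", 37: "PT", 41: "SK", 42: "G", 43: "RE", 61: "A", 75: "Y",
    96: "G", 97: "S", 104: "T", 107: "S", 125: "A", 129: "G", 134: "V",
    138: "C", 141: "K", 146: "K", 156: "T", 160: "M", 166: "R", 177: "H",
}


def find_novel_substitutions(aligned_sequence):
    """Return (position, residue) pairs of the test sequence that match the list.

    aligned_sequence -- residues of the test polypeptide, indexed so that
                        character 1 aligns with position 1 of SEQ ID NO:75
                        (an assumption of this sketch); "-" marks a gap.
    """
    hits = []
    for position, allowed in sorted(NOVEL_SUBSTITUTIONS.items()):
        if position > len(aligned_sequence):
            break
        residue = aligned_sequence[position - 1].upper()
        if residue != "-" and residue in allowed:
            hits.append((position, residue))
    return hits
```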

Preferred amino acid substitutions include those which are observed in a
number of the lipase homologues of the invention which display enantioselectivity for
geranyl butyrate versus neryl butyrate (e.g., having E values of at least about 3 for the
geranyl enantiomer): Arg at position 22, Gly at position 33, Ser or Lys at position 41, Arg at
position 43, Ser at position 107, Lys at position 141, Lys at position 146, Met at position 160,
and His at position 177. More preferred substitutions include those which are observed only
in lipase homologues of the invention which display enantioselectivity for the geranyl
enantiomer: Arg at position 43 and Ser at position 107.

Preferred amino acid substitutions also include those which are observed in a
number of the lipase homologues of the invention which display enantioselectivity for neryl
butyrate versus geranyl butyrate (e.g., having E values of at least about 1.4 for the neryl
enantiomer): Ser at position 17, Arg at position 22, Pro at position 31, Gly at position 33, Ser
or Lys at position 41, Lys at position 141, Lys at position 146, Met at position 160, Arg at
position 166, or His at position 177. More preferred substitutions include those which are
observed only in lipase homologues of the invention which display enantioselectivity for the
neryl enantiomer: Ser at position 17, Pro at position 31, and Arg at position 166.

The nucleic acid sequences of the current invention (i.e., SEQ ID NO: 1
through SEQ ID NO: 54) can be recombined (or further recombined) in accordance with the
methods described herein and expressed in, e.g., E. coli to generate additional lipase variants.
Lipase activity can be screened for on, e.g., tributyrin and further parameters such as, e.g.,
thermostability, lipase activity on novel substrates (i.e., on substrates on which known lipase
variants do not have activity, etc.) can be selected for.

Making Polynucleotides 

Polynucleotides and oligonucleotides of the invention can be prepared by 
standard solid-phase methods, according to known synthetic methods. Typically, fragments 

of up to about 100 bases are individually synthesized, then joined (e.g., by enzymatic or
chemical ligation methods, or polymerase mediated recombination methods) to form
essentially any desired continuous sequence. For example, the polynucleotides and
oligonucleotides of the invention can be prepared by chemical synthesis using, e.g., the
classical phosphoramidite method described by Beaucage et al., (1981) Tetrahedron Letters
22:1859-69, or the method described by Matthes et al., (1984) EMBO J 3: 801-05, e.g., as is
typically practiced in automated synthetic methods. According to the phosphoramidite 
method, oligonucleotides are synthesized, e.g., in an automatic DNA synthesizer, purified, 
annealed, ligated and cloned in appropriate vectors. 
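
The fragment-then-join strategy can be pictured with a short string-level sketch. It only models the bookkeeping: a target sequence is split into overlapping fragments of at most 100 bases and then rejoined through the shared overlaps. The 20-base overlap, the function names, and joining by overlap are illustrative assumptions; the chemistry of synthesis, ligation, or polymerase-mediated assembly is not represented.

```python
def split_into_fragments(sequence, max_len=100, overlap=20):
    """Split a sequence into fragments of at most max_len bases that overlap."""
    if not 0 < overlap < max_len:
        raise ValueError("overlap must be positive and smaller than max_len")
    step = max_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(sequence[start:start + max_len])
        if start + max_len >= len(sequence):
            break
        start += step
    return fragments


def join_fragments(fragments, overlap=20):
    """Rebuild the continuous sequence, checking each shared overlap."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        if not assembled.endswith(fragment[:overlap]):
            raise ValueError("adjacent fragments do not share the expected overlap")
        assembled += fragment[overlap:]
    return assembled
```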

In addition, essentially any nucleic acid can be custom ordered from any of a 

variety of commercial sources, such as The Midland Certified Reagent Company
(mcrc@oligos.com), The Great American Gene Company (www.genco.com), ExpressGen
Inc. (www.expressgen.com), Operon Technologies Inc. (Alameda, CA) and many others.
Similarly, peptides and antibodies can be custom ordered from any of a variety of sources,
such as PeptidoGenic (pkim@ccnet.com), HTI Bio-products, Inc. (www.htibio.com), BMA
Biomedicals Ltd. (U.K.), Bio.Synthesis, Inc., and many others.

Certain polynucleotides of the invention may also be obtained by screening
cDNA libraries (e.g., libraries generated by recombining homologous nucleic acids as in
typical recursive recombination methods) using oligonucleotide probes which can hybridize
to, or PCR-amplify, polynucleotides which encode the novel lipase polypeptides and
fragments of those polypeptides. Procedures for screening and isolating cDNA clones are
well-known to those of skill in the art. Such techniques are described in, for example,
Sambrook et al. (1989) supra, and Ausubel FM et al. (1989; supplemented through 2000)
supra.

As described in more detail herein, the polynucleotides of the invention 
include sequences which encode novel lipase homologues and sequences complementary to
the coding sequences, and novel fragments of coding sequence and complements thereof.
The polynucleotides can be in the form of RNA or in the form of DNA, and include mRNA,
cRNA, synthetic RNA and DNA, and cDNA. The polynucleotides can be double-stranded or
single-stranded, and if single-stranded, can be the coding strand or the non-coding (anti-sense,
complementary) strand. The polynucleotides optionally include the coding sequence
of a novel lipase homologue (i) in isolation, (ii) in combination with additional coding
sequence, so as to encode, e.g., a fusion protein, a precursor protein, a protein comprising a
leader sequence, or the like, (iii) in combination with non-coding sequences, such as introns
(including artificial introns), control elements such as a promoter, a terminator element, or 5'
and/or 3' untranslated regions effective for expression of the coding sequence in a suitable
host, and/or (iv) in a vector or host environment in which the novel lipase coding sequence is 
a heterologous gene. Sequences can also be found in combination with typical compositional 
formulations of nucleic acids, including in the presence of carriers, buffers, adjuvants, 
excipients and the like. 

Using Polynucleotides

The polynucleotides (and polypeptides) of the invention have a variety of uses
including, but not limited to, for example: recombinant production (i.e., expression) of the
recombinant lipase polypeptides of the invention for industrial and other uses (e.g., especially
as components of cleaning solutions such as laundry detergents, dish detergents, industrial
cleansers (e.g., for septic systems, grease traps, machinery parts, etc.)); as therapeutic and
prophylactic agents in methods of in vivo and ex vivo treatment of a variety of diseases, 
disorders, and conditions; for use in in vitro methods, such as diagnostic and screening 
methods, to detect, diagnose, and treat a variety of diseases, disorders, and conditions (e.g., 
pancreatic disorders) in a variety of subjects (e.g., mammals); as immunogens; in gene
therapy methods and DNA- or RNA-based delivery methods to deliver or administer in vivo,
ex vivo, or in vitro, biologically active polypeptides of the invention to a tissue, population of
cells, organ, graft, bodily system of a subject (e.g., organ system, lymphatic system, blood
system, etc.); as diagnostic probes for the presence of complementary or partially
complementary nucleic acids (including for detection of natural lipase coding nucleic acids);
as substrates for further reactions, e.g., recursive recombination reactions, mutation reactions,
or other diversity generation reactions to produce new and/or improved lipase homologues,
and new lipase nucleic acids encoding such homologues, e.g., to evolve novel therapeutic,
prophylactic, or industrial properties, and the like; for polymerase chain reactions (PCR) or
cloning methods, e.g., including digestion or ligation reactions, to identify new and/or
improved naturally-occurring or non-naturally occurring lipase nucleic acids and
polypeptides encoded therefrom. Polynucleotides which encode a lipase homologue of the
invention, or complements of the polynucleotides, are optionally administered to a cell to 
accomplish a therapeutically or prophylactically useful process or to express a therapeutically 
useful product in vivo, ex vivo, or in vitro. 

The present invention provides an isolated or recombinant nucleic acid
comprising a polynucleotide sequence selected from: a polynucleotide sequence selected
from SEQ ID NO: 1 to SEQ ID NO: 54 (or a complementary polynucleotide sequence
thereof); a polynucleotide sequence encoding a polypeptide selected from SEQ ID NO: 55 to
SEQ ID NO: 108 (or a complementary polynucleotide thereof); a polynucleotide sequence
which hybridizes under highly stringent conditions over substantially the entire length of
such polynucleotide sequences or which hybridizes to a subsequence thereof of at least 100
residues provided that the polynucleotide sequence does not correspond to or encode any of
GenBank accession numbers 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108; and a polynucleotide sequence
comprising all or a fragment of any of the previous polynucleotides and which comprises
lipase activity and which does not correspond to or encode any of the above GenBank
accession numbers.

Other embodiments of the invention can comprise an isolated or recombinant 
nucleic acid which comprises a polynucleotide sequence which encodes a polypeptide having 
an amino acid sequence that is substantially identical over at least 45, at least 50, at least 75,
at least 100, at least 125, at least 150, at least 175, or at least 200 contiguous amino acid
residues of any of SEQ ID NO: 55 to SEQ ID NO: 108 provided that the polynucleotide
sequence does not correspond to or encode any of GenBank accession numbers 1I6WA,
1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278,
AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356,
BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196,
CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340,
E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105,
and Z99108. Additionally, the invention provides an isolated or recombinant nucleic acid
which comprises a polynucleotide sequence which encodes a polypeptide having an amino
acid sequence that is substantially identical over at least 180, at least 212, at least 213, or at
least 215 contiguous amino acid residues of any of SEQ ID NO: 55 to SEQ ID NO: 108,
provided that the sequence does not correspond to or encode any of the GenBank accession
numbers listed above. 

Furthermore, the invention provides such nucleic acids as described wherein
the encoded polypeptide comprises lipase activity (e.g., against tributyrin, against tributyrin
in DMF (dimethyl formamide), or against tributyrin after being heat treated (i.e., after the
polypeptide has been heat treated)); and/or comprises enantioselective lipase activity (e.g.,
against neryl butyrate or geranyl butyrate). Optionally, such nucleic acids as described can
encode polypeptides which comprise lipase activity against novel substrates (i.e., substrates
upon which typical wild-type lipases do not act) such as, e.g., methyl esters, pentadecanolide,
or oxacyclotridecan. The invention also includes isolated or recombinant nucleic acids that
comprise a polynucleotide sequence which encodes a polypeptide comprising lipase activity
and which is produced by mutating or recombining one or more polynucleotide sequences as
described above (and which optionally comprises lipase activity) provided that the sequence
does not correspond to or encode any of the GenBank accession sequences above. The
invention additionally provides any of the above described nucleic acids wherein the encoded
polypeptide comprises: a leader sequence; a precursor peptide; an epitope tag sequence; or a
fusion protein comprising one or more additional nucleic acid 

A composition comprising two or more nucleic acids as described above, as
well as such compositions that comprise a library (e.g., of at least about 2, 5, 10, 50, or more
nucleic acids) is also a feature of the invention. Such compositions are optionally produced
by cleaving of one or more nucleic acid (e.g., by mechanical, chemical or enzymatic (e.g., a
restriction endonuclease, an RNAse, a DNAse, etc.) means) of any of the above described
nucleic acids. Compositions produced by incubating one or more of any of the above
described polynucleotides in the presence of deoxyribonucleotide triphosphates and a nucleic
acid polymerase (e.g., a thermostable polymerase) are also provided in the current invention.
Additionally, the invention provides a cell (which optionally expresses a polypeptide
encoded by the nucleic acid) comprising at least one nucleic acid as described above (or a
cleaved or amplified fragment or product thereof). Vectors and/or expression vectors (e.g.,
plasmids, cosmids, phages, viruses, virus fragments, etc.) comprising any nucleic acid as
described above, as well as any cell transduced by such vectors are also provided.
Compositions comprising any nucleic acid as described above and an excipient (optionally a
pharmaceutically acceptable excipient) are also provided in the invention.

EXPRESSION OF POLYPEPTIDES

In accordance with the present invention, polynucleotide sequences which 
encode novel lipase homologues (including mature lipase homologues), fragments of lipase
proteins, related fusion proteins, or functional equivalents thereof, collectively referred to
herein, e.g., as "lipase homologue polypeptides," "novel lipase polypeptides," or "lipase 
polypeptides" are used in recombinant DNA molecules that direct the expression of the 
lipase homologue polypeptides in appropriate host cells. Due to the inherent degeneracy of 
the genetic code, other nucleic acid sequences which encode substantially the same or a 
functionally equivalent amino acid sequence are also used to clone and express the lipase 
homologues.

Modified Coding Sequences

As will be understood by those of skill in the art, it can be advantageous to 
modify a coding sequence to enhance its expression in a particular host. The genetic code is 
redundant with 64 possible codons, but most organisms preferentially use a subset of these
codons. The codons that are utilized most often in a species are called optimal codons, and
those not utilized very often are classified as rare or low-usage codons (see, e.g., Zhang SP et
al. (1991) Gene 105:61-72). Codons can be substituted to reflect the preferred codon usage
of the host, a process called "codon optimization" or "controlling for species codon bias."
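
Codon substitution of this kind can be sketched in a few lines. The example below replaces every codon of a coding sequence with a single preferred synonymous codon, leaving the encoded amino acid sequence unchanged. The translation table is the standard genetic code; the preferred-codon table is a hypothetical, illustrative choice for a bacterial host and is not taken from the specification or from measured usage data. Practical optimization would draw on a real codon usage table for the host in question.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical preferred codon per amino acid (illustrative assumption only).
ILLUSTRATIVE_PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC", "Q": "CAG",
    "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT", "L": "CTG", "K": "AAA",
    "M": "ATG", "F": "TTT", "P": "CCG", "S": "AGC", "T": "ACC", "W": "TGG",
    "Y": "TAT", "V": "GTG", "*": "TAA",
}


def translate(dna):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    dna = dna.upper()
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, preferred=ILLUSTRATIVE_PREFERRED):
    """Swap each codon for the preferred synonymous codon of the host."""
    return "".join(preferred[amino_acid] for amino_acid in translate(dna))
```

Because each replacement is synonymous, translating the optimized sequence gives the same protein as translating the original.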

Optimized coding sequence containing codons preferred by a particular
prokaryotic or eukaryotic host (see also, Murray, E. et al. (1989) Nuc Acids Res 17:477-508)
can be prepared, for example, to increase the rate of translation or to produce recombinant
RNA transcripts having desirable properties, such as a longer half-life, as compared with
transcripts produced from a non-optimized sequence. Translation stop codons can also be
modified to reflect host preference. For example, preferred stop codons for S. cerevisiae and
mammals are UAA and UGA respectively. The preferred stop codon for monocotyledonous
plants is UGA, whereas insects and E. coli prefer to use UAA as the stop codon (Dalphin ME
et al. (1996) Nuc Acids Res 24: 216-218).
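
The stop codon preferences stated above can be captured in a small lookup. This is a sketch only: the host labels and the function name are hypothetical, and the table contains nothing beyond the preferences given in the preceding paragraph.

```python
# Preferred stop codons as stated in the text (mRNA alphabet).
PREFERRED_STOP = {
    "S. cerevisiae": "UAA",
    "mammal": "UGA",
    "monocot": "UGA",
    "insect": "UAA",
    "E. coli": "UAA",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def use_preferred_stop(mrna, host):
    """Replace the terminal stop codon of an in-frame mRNA with the host's preferred one."""
    mrna = mrna.upper()
    if len(mrna) % 3 != 0 or mrna[-3:] not in STOP_CODONS:
        raise ValueError("sequence must be in frame and end with a stop codon")
    return mrna[:-3] + PREFERRED_STOP[host]
```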

The polynucleotide sequences of the present invention can be engineered in 
order to alter lipase homologue coding sequences for a variety of reasons, including but not
limited to, alterations which modify the cloning, processing and/or expression of the gene
product. For example, alterations may be introduced using techniques which are well known
in the art, e.g., site-directed mutagenesis, to insert new restriction sites, to alter glycosylation
patterns or other conjugation patterns, to change codon preference, to introduce splice sites, 
to introduce or remove introns, etc. 

Vectors, Promoters and Expression Systems

The present invention also includes recombinant constructs comprising one or
more of the nucleic acid sequences as broadly described above. The constructs comprise a
vector, such as a plasmid, a cosmid, a phage, a virus, a bacterial artificial chromosome
(BAC), a yeast artificial chromosome (YAC), and the like, into which a nucleic acid
sequence of the invention has been inserted, e.g., a polynucleotide encoding a lipase
homologue, in a forward or reverse orientation. In a preferred aspect of this embodiment, the
construct further comprises regulatory sequences, including, for example, a promoter,
operably linked to the sequence. Large numbers of suitable vectors and promoters are known 
to those of skill in the art, and are commercially available. 

General texts which describe molecular biological techniques useful herein, 
including the use of vectors, promoters and many other relevant topics, include Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152
Academic Press, Inc., San Diego, CA (Berger); Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989 ("Sambrook") and Current Protocols in Molecular Biology,
Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates,
Inc. and John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel"). Examples of
techniques sufficient to direct persons of skill through in vitro amplification methods,
including the polymerase chain reaction (PCR), the ligase chain reaction (LCR), Qβ-replicase
amplification and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the
production of the homologous nucleic acids of the invention are found in Berger, Sambrook,
and Ausubel, as well as Mullis et al., (1987) U.S. Patent No. 4,683,202; PCR Protocols A
Guide to Methods and Applications (Innis et al. eds.) Academic Press Inc. San Diego, CA
(1990) (Innis); Arnheim & Levinson (October 1, 1990) C&EN 36-47; The Journal Of NIH
Research (1991) 3, 81-94; Kwoh et al. (1989) Proc Natl Acad Sci USA 86, 1173; Guatelli et
al. (1990) Proc Natl Acad Sci USA 87, 1874; Lomell et al. (1989) J Clin Chem 35, 1826;
Landegren et al., (1988) Science 241, 1077-1080; Van Brunt (1990) Biotechnology 8, 291-
294; Wu and Wallace, (1989) Gene 4, 560; Barringer et al. (1990) Gene 89, 117, and
Sooknanan and Malek (1995) Biotechnology 13: 563-564. Improved methods of cloning in
vitro amplified nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039.
Improved methods of amplifying large nucleic acids by PCR are summarized in Cheng et al.
(1994) Nature 369: 684-685 and the references therein, in which PCR amplicons of up to
40kb are generated. One of skill will appreciate that essentially any RNA can be converted
into a double stranded DNA suitable for restriction digestion, PCR expansion and sequencing 
using reverse transcriptase and a polymerase. See, Ausubel, Sambrook and Berger, all supra.

The present invention also relates to host cells which are transduced with
vectors of the invention, and the production of polypeptides of the invention by recombinant
techniques. Host cells are genetically engineered (i.e., transduced, transformed or
transfected) with the vectors of this invention, which can be, for example, a cloning vector or
an expression vector. The vector can be, for example, in the form of a plasmid, a viral
particle, a phage, etc. The engineered host cells can be cultured in conventional nutrient
media modified as appropriate for activating promoters, selecting transformants, or
amplifying the lipase homologue gene. The culture conditions, such as temperature, pH and
the like, are those previously used with the host cell selected for expression, and will be
apparent to those skilled in the art and in the references cited herein, including, e.g., Freshney
(1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss,
New York and the references cited therein.

The lipase homologue proteins of the invention can also be produced in
non-animal cells such as plants, yeast, fungi, bacteria and the like. In addition to Sambrook,
Berger and Ausubel, details regarding cell culture can be found in Payne et al. (1992) Plant
Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, NY;
Gamborg and Phillips (eds.) (1995) Plant Cell, Tissue and Organ Culture; Fundamental
Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) and Atlas
and Parks (eds.) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton,
FL.

The polynucleotides of the present invention may be included in any one of a 
variety of expression vectors for expressing a polypeptide. Such vectors include 
chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; 
bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from 

combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox
virus, pseudorabies, adeno-associated virus, retroviruses and many others. Any vector that
transduces genetic material into a cell, and, if replication is desired, which is replicable and
viable in the relevant host can be used. 

The nucleic acid sequence in the expression vector is operatively linked to an
appropriate transcription control sequence (promoter) to direct mRNA synthesis. Examples
of such promoters include: LTR or SV40 promoter, E. coli lac or trp promoter, phage lambda
PL promoter, and other promoters known to control expression of genes in prokaryotic or
eukaryotic cells or their viruses. The expression vector also contains a ribosome binding site
for translation initiation, and a transcription terminator. The vector optionally includes
appropriate sequences for amplifying expression. In addition, the expression vectors
optionally comprise one or more selectable marker genes to provide a phenotypic trait for
selection of transformed host cells, such as dihydrofolate reductase or neomycin resistance
for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in, e.g., E. coli.

The vector containing the appropriate DNA sequence as described above, as
well as an appropriate promoter or control sequence, may be employed to transform an
appropriate host to permit the host to express the protein. Examples of appropriate
expression hosts include: bacterial cells, such as E. coli, Streptomyces, and Salmonella
typhimurium; fungal cells, such as Saccharomyces cerevisiae, Pichia pastoris, and
Neurospora crassa; insect cells such as Drosophila and Spodoptera frugiperda; mammalian
cells such as CHO, COS, BHK, HEK 293 or Bowes melanoma; plant cells, etc. It is

understood that not all cells or cell lines need to be capable of producing fully functional 
lipase homologues; for example, antigenic fragments of lipase can be produced in a bacterial 
or other expression system. The invention is not limited by the host cells employed. 

In bacterial systems, a number of expression vectors may be selected
depending upon the use intended for the lipase homologue. For example, when large
quantities of lipase homologue, or fragments thereof, are needed for the induction of
antibodies, vectors which direct high level expression of fusion proteins that are readily
purified may be desirable. Such vectors include, but are not limited to, multifunctional E.
coli cloning and expression vectors such as BLUESCRIPT (Stratagene), in which the novel
lipase coding sequence can be ligated into the vector in-frame with sequences for the
amino-terminal Met and the subsequent 7 residues of beta-galactosidase so that a hybrid (or fusion)
protein is produced; pIN vectors (Van Heeke & Schuster (1989) J Biol Chem 264:5503-
5509); pET vectors (Novagen, Madison WI); and the like.

Similarly, in the yeast Saccharomyces cerevisiae a number of vectors
containing constitutive or inducible promoters such as alpha factor, alcohol oxidase and PGH
may be used for production of the lipase homologue proteins of the invention. For reviews,
see Ausubel et al. (supra) and Grant et al. (1987; Methods in Enzymology 153:516-544).

In mammalian host cells, a number of expression systems, such as viral-based
systems, can be utilized. In cases where an adenovirus is used as an expression vector, a
coding sequence is optionally ligated into an adenovirus transcription/translation complex
consisting of the late promoter and tripartite leader sequence. Insertion in a nonessential E1
or E3 region of the viral genome will result in a viable virus capable of expressing lipase
homologues in infected host cells (Logan and Shenk (1984) Proc Natl Acad Sci 81:3655-
3659). In addition, transcription enhancers, such as the rous sarcoma virus (RSV) enhancer,
can be used to increase expression in mammalian host cells.

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of a lipase homologue 
coding sequence. These signals can include, e.g., the ATG initiation codon and adjacent 
sequences. In cases where lipase homologue coding sequence, its initiation codon and
upstream sequences are inserted into the appropriate expression vector, no additional
translational control signals may be needed. However, in cases where only coding sequence
(e.g., a mature protein coding sequence), or a portion thereof, is inserted, exogenous
translational control signals including the ATG initiation codon must be provided.
Furthermore, the initiation codon must be positioned in the correct reading frame to ensure
translation of the entire insert to generate the desired polypeptide. Exogenous transcriptional
and/or translational elements and initiation codons can be of various origins, both natural and
synthetic. The efficiency of expression can be enhanced by the inclusion of enhancers
appropriate to the cell system in use (Scharf D et al. (1994) Results Probl Cell Differ 20:125-
62; Bittner et al. (1987) Methods in Enzymol 153:516-544).
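
The reading-frame requirement can be checked mechanically before an insert is cloned. The sketch below is illustrative only, with hypothetical function names: it prepends an ATG to a coding sequence that lacks one and tests whether the initiation codon is in frame with a terminal stop codon and no earlier in-frame stop. Promoters, ribosome binding sites, and other control elements are outside its scope.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def with_initiation_codon(coding_dna):
    """Prepend ATG when the coding sequence does not already begin with one."""
    coding_dna = coding_dna.upper()
    return coding_dna if coding_dna.startswith("ATG") else "ATG" + coding_dna


def translates_entire_insert(dna):
    """True if ATG is in frame with a terminal stop and no internal stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0 or not dna.startswith("ATG"):
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    has_terminal_stop = codons[-1] in STOP_CODONS
    has_internal_stop = any(codon in STOP_CODONS for codon in codons[1:-1])
    return has_terminal_stop and not has_internal_stop
```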

Secretion/Localization Sequences

Polynucleotides of the invention can also be fused, for example, in-frame to
nucleic acid encoding a secretion/localization sequence, to target polypeptide expression to a
desired cellular compartment, membrane, or organelle, or to direct polypeptide secretion to
the periplasmic space or into the cell culture media. Such sequences are known to those of
skill, and include secretion leader peptides, organelle targeting sequences (e.g., nuclear
localization sequences, ER retention signals, mitochondrial transit sequences, chloroplast
transit sequences), membrane localization/anchor sequences (e.g., stop transfer sequences, 
GPI anchor sequences), and the like. 

Expression Hosts 

In a further embodiment, the present invention relates to host cells containing
the above-described constructs, e.g., vectors comprising lipase homologues. The host cell
can be a eukaryotic cell, such as a mammalian cell, a yeast cell, or a plant cell, or the host
cell can be a prokaryotic cell, such as a bacterial cell (e.g., an E. coli cell). Introduction of
the construct into the host cell can be effected by calcium phosphate transfection,
DEAE-Dextran mediated transfection, electroporation, or other common techniques (Davis, L.,
Dibner, M., and Battey, J. (1986) Basic Methods in Molecular Biology; Sambrook and
Ausubel, supra).

A host cell strain is optionally chosen for its ability to modulate the expression 
of the inserted sequences or to process the expressed protein in the desired fashion. Such
modifications of the protein include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation and acylation. Post-translational processing which
cleaves a precursor form into a mature form of the protein may also be important for correct
insertion, folding and/or function. Different host cells such as COS, CHO, HeLa, BHK,
MDCK, 293, WI38, etc. have specific cellular machinery and characteristic mechanisms for
such post-translational activities and can be chosen to ensure the correct modification and
processing of the introduced, foreign protein. 

For long-term, high-yield production of recombinant proteins, stable
expression can be used. For example, cell lines which stably express a polypeptide of the
invention are transduced using expression vectors which contain viral origins of replication
or endogenous expression elements and a selectable marker gene. Following the introduction
of the vector, cells can be allowed to grow for 1-2 days in an enriched media before they are 
switched to selective media. The purpose of the selectable marker is to confer resistance to 
selection, and its presence allows growth and recovery of cells which successfully express 
the introduced sequences. For example, resistant clumps of stably transformed cells can be 

30 proUfCTated using tissue culture techniques appropriate to the cell type. 
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Host cells transformed with a nucleotide sequence encoding a polypeptide of 
the invention axe optionally cultured under conditions suitable for the expression and 
recovery of the encoded protein from cell culture. The protein or fragment thereof produced 
by a recombinant cell can be secreted, membrane-bound, or contained intracellularly, 
5 depending on the sequence and/or the vector used. As will be understood by those of skill in 
the art, expression vectors containing polynucleotides encoding lipase homologues of the 
invention can be designed with signal sequences which direct secretion of the polypeptides 
through a prokaryotic or eukaryotic cell membrane. 

Additional Polypeptide Sequences 
The polynucleotides of the present invention may also comprise a coding
sequence fused in-frame to a marker sequence which, e.g., facilitates purification of the
encoded polypeptide. Such purification facilitating domains include, but are not limited to,
metal chelating peptides such as histidine-tryptophan modules that allow purification on
immobilized metals, a sequence which binds glutathione (e.g., GST), a hemagglutinin (HA)
tag (corresponding to an epitope derived from the influenza hemagglutinin protein; Wilson,
I., et al. (1984) Cell 37:767), maltose binding protein sequences, the FLAG epitope utilized
in the FLAG extension/affinity purification system (Immunex Corp, Seattle, WA), and the
like. The inclusion of a protease-cleavable polypeptide linker sequence between the
purification domain and the lipase homologue sequence is useful to facilitate purification.

For example, one expression vector contemplated for use in the compositions
and methods described herein provides for expression of a fusion protein comprising a
polypeptide of the invention fused to a polyhistidine region separated by an enterokinase
cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion
affinity chromatography, as described in Porath et al. (1992) Protein Expression and
Purification 3:263-281) while the enterokinase cleavage site provides a means for separating
the lipase homologue polypeptide from the fusion protein. pGEX vectors (Promega;
Madison, WI) can also be used to express foreign polypeptides as fusion proteins with
glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily
be purified from the culture medium or from lysed cells by adsorption to ligand-agarose
beads (e.g., glutathione-agarose in the case of GST-fusions) followed by elution in the
presence of free ligand.

Polypeptide Production and Recovery 

Following transduction of a suitable host cell line or strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced by appropriate
means (e.g., temperature shift or chemical induction) and cells are cultured for an additional
period. The secreted polypeptide product is then recovered from the culture medium.
Alternatively, cells can be harvested by centrifugation, disrupted by physical or chemical
means, and the resulting crude extract retained for further purification. Eukaryotic or
microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, use of cell
lysing agents, or other methods which are well known to those skilled in the art.

As noted, many references are available for the culture and production of
many cells, including cells of bacterial, plant, animal (especially mammalian) and
archaebacterial origin. See, e.g., Sambrook, Ausubel, and Berger (all supra), as well as
Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-
Liss, New York and the references cited therein; Doyle and Griffiths (1997) Mammalian Cell
Culture: Essential Techniques John Wiley and Sons, NY; Humason (1979) Animal Tissue
Techniques, fourth edition W.H. Freeman and Company; and Ricciardelli, et al., (1989) In
Vitro Cell Dev Biol 25:1016-1024. For plant cell culture and regeneration, see, e.g., Payne et
al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New
York, NY; Gamborg and Phillips (eds.) (1995) Plant Cell, Tissue and Organ Culture;
Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York)
and Plant Molecular Biology (1993) R. R. D. Croy, Ed. Bios Scientific Publishers, Oxford,
U.K. ISBN 0 12 198370 6. Cell culture media in general are set forth in Atlas and Parks
(eds.) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, FL.
Additional information for cell culture is found in available commercial literature such as the
Life Science Research Cell Culture Catalogue (1998) from Sigma-Aldrich, Inc (St Louis,
MO) ("Sigma-LSRCCC") and, e.g., the Plant Culture Catalogue and supplement (1997) also
from Sigma-Aldrich, Inc (St Louis, MO) ("Sigma-PCCS").

Polypeptides of the invention can be recovered and purified from recombinant
cell cultures by any of a number of methods well known in the art, including ammonium
sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography,
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity
chromatography (e.g., using any of the tagging systems noted herein), hydroxylapatite
chromatography, and lectin chromatography. Protein refolding steps can be used, as desired,
in completing configuration of the mature protein. Finally, high performance liquid
chromatography (HPLC) can be employed in the final purification steps. In addition to the
references noted above, a variety of purification methods are well known in the art,
including, e.g., those set forth in Sandana (1997) Bioseparation of Proteins, Academic Press,
Inc.; Bollag et al. (1996) Protein Methods, 2nd Edition Wiley-Liss, NY; Walker (1996)
The Protein Protocols Handbook Humana Press, NJ; Harris and Angal (1990) Protein
Purification Applications: A Practical Approach IRL Press at Oxford, Oxford, England;
Harris and Angal Protein Purification Methods: A Practical Approach IRL Press at Oxford,
Oxford, England; Scopes (1993) Protein Purification: Principles and Practice 3rd Edition
Springer Verlag, NY; Janson and Ryden (1998) Protein Purification: Principles, High
Resolution Methods and Applications, Second Edition Wiley-VCH, NY; and Walker (1998)
Protein Protocols on CD-ROM Humana Press, NJ.

In vitro Expression Systems 
Cell-free transcription/translation systems can also be employed to produce
polypeptides comprising lipase homologues, and fragments thereof, using DNAs or RNAs of
the present invention. Several such systems are commercially available. A general guide to
in vitro transcription and translation protocols is found in Tymms (1995) In vitro
Transcription and Translation Protocols: Methods in Molecular Biology Volume 37, Garland
Publishing, NY.

Modified Amino Acids 

Polypeptides of the invention can contain one or more modified amino acids.
The presence of modified amino acids can be advantageous in, for example, (a) increasing
polypeptide serum half-life, (b) reducing polypeptide antigenicity, and (c) increasing
polypeptide storage stability. Amino acid(s) are modified, for example, co-translationally or
post-translationally during recombinant production (e.g., N-linked glycosylation at N-X-S/T
motifs during expression in mammalian cells) or modified by synthetic means.

Non-limiting examples of a modified amino acid include a glycosylated
amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated)
amino acid, an acetylated amino acid, an acylated amino acid, a PEG-ylated amino acid, a
biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the
like, as well as amino acids modified by conjugation to, e.g., lipid moieties or other organic
derivatizing agents. References adequate to guide one of skill in the modification of amino
acids are replete throughout the literature. Example protocols are found in Walker (1998)
Protein Protocols on CD-ROM Humana Press, Totowa, NJ.

IN VIVO USES 

Polynucleotides which encode a lipase homologue of the invention, or
complements of the polynucleotides (i.e., antisense polynucleotides), are optionally
administered to a cell to accomplish a therapeutically useful process or to express a
therapeutically useful product. These in vivo applications, including gene therapy, include a
multitude of techniques by which gene expression can be altered in cells. Such methods
include, for instance, the introduction of genes for expression of therapeutically and/or
prophylactically useful polypeptides, such as the lipase homologues of the present invention,
e.g., to hydrolyze ester bonds of lipids in the treatment of, e.g., Crohn's disease, etc.

In Vivo Polypeptide Expression

Polynucleotides encoding lipase homologue polypeptides of the invention are
useful for in vivo therapeutic applications, including prophylactic applications, using
techniques well known to those skilled in the art. For example, cultured cells are engineered
ex vivo with a polynucleotide (DNA or RNA), with the engineered cells then being returned
to the patient. Cells may also be engineered in vivo for expression of a polypeptide in vivo.
As noted, and as described in more detail below, lipase production is also useful for a variety
of industrial processes, including lipid degradation, and regio- or stereo-selective reaction
with lipids.

A number of viral vectors suitable for organismal in vivo transduction and
expression are known. Such vectors include retroviral vectors (see, Miller (1992) Curr Top
Microbiol Immunol 158: 1-24; Salmons and Gunzburg (1993) Human Gene Therapy 4:129-
141; Miller et al. (1994) Methods in Enzymology 217: 581-599) and adeno-associated
vectors (reviewed in Carter (1992) Curr Opinion Biotech 3: 533-539; Muzyczka (1992) Curr
Top Microbiol Immunol 158: 97-129). Other viral vectors that are used include adenoviral
vectors, herpes viral vectors and Sindbis viral vectors, as generally described in, e.g., Jolly
(1994) Cancer Gene Therapy 1:51-64; Latchman (1994) Molec Biotechnol 2:179-195; and
Johanning et al. (1995) Nucl Acids Res 23:1495-1501.

Gene therapy provides methods for combating chronic infectious diseases
(e.g., HIV infection, viral hepatitis), as well as non-infectious diseases including cancer and
some forms of congenital defects such as enzyme deficiencies. Several approaches for
introducing nucleic acids into cells in vivo, ex vivo and in vitro have been used. These
include liposome based gene delivery (Debs and Zhu (1993) WO 93/24640 and U.S. Pat. No.
5,641,662; Mannino and Gould-Fogerite (1988) BioTechniques 6(7): 682-691; Rose, U.S.
Pat. No. 5,279,833; Brigham (1991) WO 91/06309; Felgner et al. (1987) Proc Natl Acad
Sci USA 84: 7413-7414; Brigham et al. (1989) Am J Med Sci 298:278-281; Nabel et al.
(1990) Science 249:1285-1288; Hazinski et al. (1991) Am J Resp Cell Molec Biol 4:206-
209; and Wang and Huang (1987) Proc Natl Acad Sci USA 84:7851-7855); adenoviral
vector mediated gene delivery, e.g., to treat cancer (see, e.g., Chen et al. (1994) Proc Natl
Acad Sci USA 91: 3054-3057; Tong et al. (1996) Gynecol Oncol 61: 175-179; Clayman et
al. (1995) Cancer Res 5: 1-6; O'Malley et al. (1995) Cancer Res 55: 1080-1085; Hwang et al.
(1995) Am J Respir Cell Mol Biol 13: 7-16; Haddada et al. (1995) Curr Top Microbiol
Immunol 199 (Pt. 3): 297-306; Addison et al. (1995) Proc Natl Acad Sci USA 92: 8522-
8526; Colak et al. (1995) Brain Res 691: 76-82; Crystal (1995) Science 270: 404-410;
Elshami et al. (1996) Human Gene Ther 7: 141-148; Vincent et al. (1996) J Neurosurg 85:
648-654), and many other diseases. Replication-defective retroviral vectors harboring
a therapeutic polynucleotide sequence as part of the retroviral genome have also been used,
particularly with regard to simple MuLV vectors. See, e.g., Miller et al. (1990) Mol Cell Biol
10:4239; Kolberg (1992) J NIH Res 4:43; and Cornetta et al. (1991) Hum Gene Ther
2:215. Nucleic acid transport coupled to ligand-specific, cation-based transport systems (Wu
and Wu (1988) J Biol Chem 263: 14621-14624) has also been used. Naked DNA
expression vectors have also been described (Nabel et al. (1990), supra; Wolff et al. (1990)
Science 247: 1465-1468). In general, these approaches can be adapted to the invention by
incorporating nucleic acids encoding the lipase homologues herein into the appropriate
vectors.

General texts which describe gene therapy protocols, which can be adapted to
the present invention by introducing the nucleic acids of the invention into patients, include
Robbins (1996) Gene Therapy Protocols, Humana Press, NJ, and Joyner (1993) Gene
Targeting: A Practical Approach, IRL Press, Oxford, England.

Antisense Technology

In addition to expression of the nucleic acids of the invention as gene
replacement nucleic acids, the nucleic acids are also useful for sense and antisense
suppression of expression, e.g., to down-regulate expression of a nucleic acid of the
invention, once expression of the nucleic acid is no longer desired in the cell. Similarly, the
nucleic acids of the invention, or subsequences or antisense sequences thereof, can also be
used to block expression of naturally occurring homologous nucleic acids. A variety of sense
and anti-sense technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen
(1997) Antisense Technology: A Practical Approach IRL Press at Oxford University,
Oxford, England, and in Agrawal (1996) Antisense Therapeutics Humana Press, NJ, and the
references cited therein.

Pharmaceutical Compositions

The polynucleotides of the invention may be employed for therapeutic uses in
combination with a suitable pharmaceutical carrier. Such compositions comprise a
therapeutically effective amount of the compound, and a pharmaceutically acceptable carrier
or excipient. Such a carrier or excipient includes, but is not limited to, saline, buffered saline,
dextrose, water, glycerol, ethanol, and combinations thereof. The formulation should suit the
mode of administration. Methods of administering nucleic acids and proteins are well known
in the art, and further discussed below.

Use as Probes 

Also contemplated are uses of polynucleotides, also referred to herein as
oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at
least 20, 30, or 50 bases, which hybridize under highly stringent conditions to a lipase
polynucleotide sequence described above. The polynucleotides may be used as probes,
primers, sense and antisense agents, and the like.

SEQUENCE VARIATIONS 

Silent Variations 

It will be appreciated by those skilled in the art that due to the degeneracy of
the genetic code, a multitude of nucleic acid sequences encoding novel lipase polypeptides
of the invention may be produced, some of which may bear minimal sequence homology to the
nucleic acid sequences explicitly disclosed herein.

Table 1
Codon Table

Amino acid       3-letter  1-letter  Codons
Alanine          Ala       A         GCA GCC GCG GCU
Cysteine         Cys       C         UGC UGU
Aspartic acid    Asp       D         GAC GAU
Glutamic acid    Glu       E         GAA GAG
Phenylalanine    Phe       F         UUC UUU
Glycine          Gly       G         GGA GGC GGG GGU
Histidine        His       H         CAC CAU
Isoleucine       Ile       I         AUA AUC AUU
Lysine           Lys       K         AAA AAG
Leucine          Leu       L         UUA UUG CUA CUC CUG CUU
Methionine       Met       M         AUG
Asparagine       Asn       N         AAC AAU
Proline          Pro       P         CCA CCC CCG CCU
Glutamine        Gln       Q         CAA CAG
Arginine         Arg       R         AGA AGG CGA CGC CGG CGU
Serine           Ser       S         AGC AGU UCA UCC UCG UCU
Threonine        Thr       T         ACA ACC ACG ACU
Valine           Val       V         GUA GUC GUG GUU
Tryptophan       Trp       W         UGG
Tyrosine         Tyr       Y         UAC UAU






For instance, inspection of the codon table (Table 1) shows that codons AGA,
AGG, CGA, CGC, CGG, and CGU all encode the amino acid arginine. Thus, at every
position in the nucleic acids of the invention where an arginine is specified by a codon, the
codon can be altered to any of the corresponding codons described above without altering the
encoded polypeptide. It is understood that U in an RNA sequence corresponds to T in a
DNA sequence.

Using, as an example, the nucleic acid sequence of clone Ifl5(g2)
corresponding to nucleotides 2-16 of SEQ ID NO: 21, GAA CAC AAT CCA GTT, a silent
variation of this sequence includes GAG CAT AAC CCC GTG, both of which sequences
encode the amino acid sequence EHNPV, which corresponds to amino acids 1-5 of SEQ ID
NO: 75.
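
The silent-variation example above can be checked mechanically. The following is a minimal sketch, assuming only the codon assignments of Table 1; the function name `translate` and the dictionary names are illustrative and not part of the disclosure. Stop codons are not listed in Table 1, so this sketch raises an error if one is encountered.

```python
# Codon assignments as listed in Table 1 (RNA alphabet, one-letter amino acid codes).
TABLE_1 = {
    "A": "GCA GCC GCG GCU", "C": "UGC UGU", "D": "GAC GAU", "E": "GAA GAG",
    "F": "UUC UUU", "G": "GGA GGC GGG GGU", "H": "CAC CAU", "I": "AUA AUC AUU",
    "K": "AAA AAG", "L": "UUA UUG CUA CUC CUG CUU", "M": "AUG", "N": "AAC AAU",
    "P": "CCA CCC CCG CCU", "Q": "CAA CAG", "R": "AGA AGG CGA CGC CGG CGU",
    "S": "AGC AGU UCA UCC UCG UCU", "T": "ACA ACC ACG ACU",
    "V": "GUA GUC GUG GUU", "W": "UGG", "Y": "UAC UAU",
}

# Invert the table: codon -> amino acid.
CODON_TO_AA = {
    codon: aa for aa, codons in TABLE_1.items() for codon in codons.split()
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (spaces allowed) using Table 1."""
    # U in an RNA sequence corresponds to T in a DNA sequence.
    rna = dna.replace(" ", "").upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    # A codon absent from Table 1 (e.g., a stop codon) raises KeyError.
    return "".join(CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3))


original = "GAA CAC AAT CCA GTT"
silent_variant = "GAG CAT AAC CCC GTG"

print(translate(original), translate(silent_variant))
```

Both sequences differ at every third codon position yet translate to the same five residues, which is what makes the second sequence a silent variation of the first.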

Such "silent variations" are one species of "conservatively modified
variations", discussed below. One of skill will recognize that each codon in a nucleic acid
(except AUG and UGG, which are ordinarily the only codons for methionine and tryptophan
respectively) can be modified by standard techniques to encode a functionally identical
polypeptide. Accordingly, each silent variation of a nucleic acid which encodes a
polypeptide is implicit in any described sequence. The invention provides each and every
possible variation of nucleic acid sequence encoding a polypeptide of the invention that
could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code (e.g., as set forth
in Table 1) as applied to the nucleic acid sequence encoding a lipase homologue polypeptide
of the invention. All such variations of every nucleic acid herein are specifically provided
and described by consideration of the sequence in combination with the genetic code. One of
skill is fully able to generate or select such variations based upon knowledge of the genetic
code as well as considerations such as codon preferences of a specific organism chosen for
expression of a polypeptide encoded by the nucleic acid.
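
To give a sense of scale for "each and every possible variation", the number of nucleic acid sequences that encode a given peptide is the product of the codon counts for each residue in Table 1. The sketch below is illustrative only; the names `count_encodings` and `enumerate_encodings` are hypothetical helpers, and the peptide EHNPV is the five-residue example used above.

```python
from itertools import product
from math import prod

# Synonymous codons per amino acid, as listed in Table 1 (RNA alphabet).
SYNONYMOUS = {
    "A": ["GCA", "GCC", "GCG", "GCU"], "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"], "E": ["GAA", "GAG"], "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"], "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"], "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"], "M": ["AUG"],
    "N": ["AAC", "AAU"], "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"], "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"], "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"], "Y": ["UAC", "UAU"],
}


def count_encodings(peptide: str) -> int:
    """Number of distinct RNA sequences that encode the peptide."""
    return prod(len(SYNONYMOUS[aa]) for aa in peptide)


def enumerate_encodings(peptide: str):
    """Yield every RNA sequence that encodes the peptide (small peptides only)."""
    for codons in product(*(SYNONYMOUS[aa] for aa in peptide)):
        yield "".join(codons)


print(count_encodings("EHNPV"))  # 2 * 2 * 2 * 4 * 4 = 128
```

Because the count grows multiplicatively with length, enumeration is practical only for short stretches; for a full-length polypeptide, `count_encodings` is the useful quantity, and selection among the variants would in practice be guided by the codon preferences of the expression host.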

Conservative Variations 

"Conservatively modified variations" or, simply, "conservative variations" of
a particular nucleic acid sequence refer to those nucleic acids which encode identical or
essentially identical amino acid sequences, or, where the nucleic acid does not encode an
amino acid sequence, to essentially identical sequences. One of skill will recognize that
individual substitutions, deletions or additions which alter, add or delete a single amino acid
or a small percentage of amino acids (typically less than 5%, more typically less than 4%, 2%
or 1%) in an encoded sequence are "conservatively modified variations" where the
alterations result in the deletion of an amino acid, addition of an amino acid, or substitution
of an amino acid with a chemically similar amino acid.

Conservative substitution tables providing functionally similar amino acids
are well known in the art. Table 2 sets forth six groups which contain amino acids that are
"conservative substitutions" for one another.

Table 2
Conservative Substitution Groups

1   Alanine (A)         Serine (S)          Threonine (T)
2   Aspartic acid (D)   Glutamic acid (E)
3   Asparagine (N)      Glutamine (Q)
4   Arginine (R)        Lysine (K)
5   Isoleucine (I)      Leucine (L)         Methionine (M)      Valine (V)
6   Phenylalanine (F)   Tyrosine (Y)        Tryptophan (W)

Thus, "conservatively substituted variations" of a listed polypeptide sequence
of the present invention include substitutions of a small percentage, typically less than 5%,
more typically less than 4%, 3%, 2% or 1%, of the amino acids of the polypeptide sequence,
with a conservatively selected amino acid of the same conservative substitution group.

For example, a conservatively substituted variation of the polypeptide
identified herein as SEQ ID NO: 75 will contain "conservative substitutions," according to
the six groups defined herein, in up to 9 residues (i.e., 5% of the amino acids) in the 180
amino acid polypeptide.
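
The definition above can be expressed as a simple check. This is a minimal sketch, assuming the six groups of Table 2 and treating the 5% figure as an inclusive limit (as in the 9-of-180 example); the function names and the single-substitution variant shown are hypothetical illustrations, not sequences from the sequence listing.

```python
# The six conservative substitution groups of Table 2 (one-letter codes).
GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def max_conservative_substitutions(length: int, max_percent: int = 5) -> int:
    """Largest substitution count within max_percent of the residues."""
    return (length * max_percent) // 100


def classify_substitutions(reference: str, variant: str):
    """Return (conservative, other) lists of (position, ref_aa, var_aa)."""
    ref = reference.replace(" ", "")
    var = variant.replace(" ", "")
    if len(ref) != len(var):
        raise ValueError("sequences must have equal length")
    conservative, other = [], []
    for position, (a, b) in enumerate(zip(ref, var), start=1):
        if a == b:
            continue
        # Conservative only if both residues sit in the same Table 2 group.
        same_group = a in GROUP_OF and GROUP_OF.get(b) == GROUP_OF[a]
        (conservative if same_group else other).append((position, a, b))
    return conservative, other


def is_conservatively_substituted(reference, variant, max_percent=5):
    """True if all changes are conservative and within the percentage limit."""
    conservative, other = classify_substitutions(reference, variant)
    length = len(reference.replace(" ", ""))
    # Integer arithmetic avoids floating-point edge cases at exactly 5%.
    return not other and len(conservative) * 100 <= max_percent * length


region = "EHNPV VMVHG IGGAS ENFAG"          # amino acids 1-20 of SEQ ID NO: 75
hypothetical = "DHNPV VMVHG IGGAS ENFAG"    # illustrative E->D change at position 1

print(max_conservative_substitutions(180))
print(classify_substitutions(region, hypothetical))
```

Residues that appear in none of the six groups (e.g., glycine, proline, histidine, cysteine) have no conservative partner under Table 2, so any change at such a position is reported in the second list.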

In a further example, if four conservative substitutions were localized in the
region corresponding to amino acids 1-20 of SEQ ID NO: 75, examples of conservatively
substituted variations of this region, EHNPV VMVHG IGGAS ENFAG, include:

DHNPV IMVHG MGGAS YNFAG and

DHQPV VVVHG IGGSS FNFSG

and the like, in accordance with the conservative substitutions listed in Table
2 (in the above example, conservative substitutions are underlined). Listing of a protein
sequence herein, in conjunction with the above substitution table, provides an express listing
of all conservatively substituted proteins.

Finally, the addition of sequences which do not alter the encoded activity of a
nucleic acid molecule, such as the addition of a non-functional sequence, is a conservative
variation of the basic nucleic acid.

One of skill will appreciate that many conservative variations of the nucleic
acid constructs which are disclosed yield a functionally identical construct. For example, as
discussed above, owing to the degeneracy of the genetic code, "silent substitutions" (i.e.,
substitutions in a nucleic acid sequence which do not result in an alteration in an encoded
polypeptide) are an implied feature of every nucleic acid sequence which encodes an amino
acid. Similarly, "conservative amino acid substitutions," in which one or a few amino acids in an
amino acid sequence are substituted with different amino acids with highly similar
properties, are also readily identified as being highly similar to a disclosed construct. Such
conservative variations of each disclosed sequence are a feature of the present invention.

Nucleic Acid Hybridization

Nucleic acids "hybridize" when they associate, typically in solution. Nucleic
acids hybridize due to a variety of well characterized physico-chemical forces, such as
hydrogen bonding, solvent exclusion, base stacking and the like. An extensive guide to the
hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in
Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, part I,
chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe
assays," (Elsevier, New York), as well as in Ausubel, supra. Hames and Higgins (1995)
Gene Probes 1, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins
1) and Hames and Higgins (1995) Gene Probes 2, IRL Press at Oxford University Press,
Oxford, England (Hames and Higgins 2) provide details on the synthesis, labeling, detection
and quantification of DNA and RNA, including oligonucleotides.

"Stringent hybridization wash conditions" in the context of nucleic acid
hybridization experiments, such as Southern and northern hybridizations, are sequence
dependent, and are different under different environmental parameters. An extensive guide
to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and
Higgins 1 and Hames and Higgins 2, supra.

For purposes of the present invention, generally, "highly stringent" 
hybridization and wash conditions are selected to be about 5° C or less lower than the 
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH (as 
noted below, highly stringent conditions can also be referred to in comparative terms). The 
Tm is the temperature (under defined ionic strength and pH) at which 50% of the test 
sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to 
be equal to the Tm for a particular probe. 

The Tm temperature of the nucleic acid duplexes indicates the temperature at 
which the duplex is 50% denatured under the given conditions and represents a direct 
measure of the stability of the nucleic acid hybrid. Thus, the Tm corresponds to the 
temperature corresponding to the midpoint in transition from helix to random coil; it depends 
on length, nucleotide composition, and ionic strength for long stretches of nucleotides. 

After hybridization, unhybridized nucleic acid material can be removed by a
series of washes, the stringency of which can be adjusted depending upon the desired results.
Low stringency washing conditions (e.g., using higher salt and lower temperature) increase
sensitivity, but can produce nonspecific hybridization signals and high background signals.
Higher stringency conditions (e.g., using lower salt and higher temperature that is closer to
the hybridization temperature) lower the background signal, typically with only the specific
signal remaining. See, Rapley, R. and Walker, J.M. eds., Molecular Biomethods Handbook
(Humana Press, Inc. 1998) (hereinafter "Rapley and Walker"), which is incorporated herein
by reference in its entirety for all purposes.

The Tm of a DNA-DNA duplex can be estimated using the following equation:

Tm (°C) = 81.5°C + 16.6 (log10 M) + 0.41 (%G + C) - 0.72 (%f) - 500/n,
where M is the molarity of the monovalent cations (usually Na+), (%G + C) is the percentage
of guanosine (G) and cytosine (C) nucleotides, (%f) is the percentage of formamide and n is
the number of nucleotide bases (i.e., length) of the hybrid. See, Rapley and Walker, supra.
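
As a worked illustration of the DNA-DNA estimate, the equation can be evaluated directly. The sketch below simply transcribes the terms given above; the function name `tm_dna_dna` and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def tm_dna_dna(molarity: float, pct_gc: float, pct_formamide: float,
               length: int) -> float:
    """Estimate Tm (deg C) of a DNA-DNA duplex.

    molarity       -- monovalent cation concentration M (mol/L)
    pct_gc         -- percentage of G + C nucleotides (0-100)
    pct_formamide  -- percentage of formamide (0-100)
    length         -- number of nucleotide bases n in the hybrid
    """
    if molarity <= 0 or length <= 0:
        raise ValueError("molarity and length must be positive")
    return (81.5
            + 16.6 * math.log10(molarity)
            + 0.41 * pct_gc
            - 0.72 * pct_formamide
            - 500.0 / length)


# Hypothetical inputs: 1 M Na+, 50% G+C, 500-base hybrid.
print(tm_dna_dna(1.0, 50.0, 0.0, 500))    # 81.5 + 0 + 20.5 - 0 - 1 = 101.0
print(tm_dna_dna(1.0, 50.0, 50.0, 500))   # 50% formamide lowers the estimate by 36
```

The second call shows why formamide is included in hybridization buffers: each percent of formamide lowers the estimated Tm by 0.72°C, allowing stringent hybridization at a more moderate temperature.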

The Tm of an RNA-DNA duplex can be estimated as follows:

Tm (°C) = 79.8°C + 18.5 (log10 M) + 0.58 (%G + C) - 11.8 (%G + C)^2 - 0.56 (%f) - 820/n,
where M is the molarity of the monovalent cations (usually Na+), (%G + C) is
the percentage of guanosine (G) and cytosine (C) nucleotides, (%f) is the percentage of
formamide and n is the number of nucleotide bases (i.e., length) of the hybrid. Id. Equations
1 and 2 are typically accurate only for hybrid duplexes longer than about 100-200
nucleotides. Id.

The Tm of nucleic acid sequences shorter than 50 nucleotides can be
calculated as follows:

Tm (°C) = 4(G + C) + 2(A + T),
where A (adenine), C, T (thymine), and G are the numbers of the corresponding nucleotides.
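
For short probes the estimate reduces to counting bases. The following minimal sketch applies the rule above to the 15-nucleotide example sequence used earlier in this section; the function name `tm_short_oligo` is illustrative, and the 50-nucleotide limit is enforced only because the text restricts the formula to shorter sequences.

```python
def tm_short_oligo(sequence: str) -> int:
    """Estimate Tm (deg C) as 4(G + C) + 2(A + T) for a short DNA sequence."""
    seq = sequence.replace(" ", "").upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    if len(seq) >= 50:
        raise ValueError("rule applies to sequences shorter than 50 nucleotides")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at


# GAA CAC AAT CCA GTT has 6 G/C and 9 A/T bases: 4*6 + 2*9 = 42.
print(tm_short_oligo("GAA CAC AAT CCA GTT"))
```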

An example of stringent hybridization conditions for hybridization of
complementary nucleic acids which have more than 100 complementary residues on a filter
in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42°C, with the
hybridization being carried out overnight. An example of stringent wash conditions is a 0.2x
SSC wash at 65°C for 15 minutes (see Sambrook, supra for a description of SSC buffer).
Often the high stringency wash is preceded by a low stringency wash to remove background
probe signal. An example low stringency wash is 2x SSC at 40°C for 15 minutes.

In general, a signal to noise ratio 2.5x-5x (or more) as high as that observed for
an unrelated probe in the particular hybridization assay indicates detection of a specific
hybridization. Detection of at least stringent hybridization between two sequences in the
context of the present invention indicates relatively strong structural similarity or homology
to, e.g., the nucleic acids of the present invention provided in the sequence listings herein.

As noted, "highly stringent" conditions are selected to be about 5° C or less
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. Target sequences that are closely related or identical to the nucleotide sequence of
interest (e.g., "probe") can be identified under highly stringent conditions. Lower
stringency conditions are appropriate for sequences that are less complementary. See, e.g.,
Rapley and Walker, supra.

Comparative hybridization can be used to identify nucleic acids of the
invention, and this comparative hybridization method is a preferred method of distinguishing
nucleic acids of the invention. Detection of highly stringent hybridization between two
nucleotide sequences in the context of the present invention indicates relatively strong
structural similarity/homology to, e.g., the nucleic acids provided in the sequence listing
herein. Highly stringent hybridization between two nucleotide sequences demonstrates a
degree of similarity or homology of structure, nucleotide base composition, arrangement or
order that is greater than that detected by stringent hybridization conditions. In particular,
detection of highly stringent hybridization in the context of the present invention indicates
strong structural similarity or structural homology (e.g., nucleotide structure, base
composition, arrangement or order) to, e.g., the nucleic acids provided in the sequence
listings herein. For example, it is desirable to identify test nucleic acids which hybridize to
the exemplar nucleic acids herein under stringent conditions.

Thus, one measure of stringent hybridization is the ability to hybridize to one
of the listed nucleic acids (e.g., nucleic acid sequences SEQ ID NO:1 to SEQ ID NO:54, and
complementary polynucleotide sequences thereof) under highly stringent conditions (or very
stringent conditions, or ultra-high stringency hybridization conditions, or ultra-ultra high
stringency hybridization conditions). Stringent hybridization (including, e.g., highly
stringent, ultra-high stringency, or ultra-ultra high stringency hybridization conditions) and
wash conditions can easily be determined empirically for any test nucleic acid.

For example, in determining highly stringent hybridization and wash
conditions, the stringency of the hybridization and wash conditions is gradually increased (e.g., by increasing
temperature, decreasing salt concentration, increasing detergent concentration and/or
increasing the concentration of organic solvents, such as formamide, in the hybridization or
wash), until a selected set of criteria are met. For example, the stringency of the hybridization and wash
conditions is gradually increased until a probe comprising one or more nucleic acid
sequences selected from SEQ ID NO:1 to SEQ ID NO:54, and complementary
polynucleotide sequences thereof, binds to a perfectly matched complementary target (again,
a nucleic acid comprising one or more nucleic acid sequences selected from SEQ ID NO:1 to
SEQ ID NO:54, and complementary polynucleotide sequences thereof), with a signal to noise
ratio that is at least 2.5x, and optionally 5x or more, as high as that observed for hybridization
of the probe to an unmatched target. In this case, the unmatched target is a nucleic acid
corresponding to, e.g., a known lipase homologue, e.g., a lipase homologue nucleic acid
(other than those in the accompanying sequence listing) that is present in a public database
such as GenBank™ at the time of filing of the subject application. Examples of such
unmatched target nucleic acids include, e.g., those represented by or which encode the
following GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108. It will be appreciated
that the above GenBank accession numbers represent both amino acid and nucleic acid
sequences. In the present application, such sequences should be read in context, e.g., when
the context indicates an amino acid is to be considered, then the accession numbers that
represent a nucleic acid should be interpreted as their amino acid translations and when the
context indicates that a nucleic acid is intended, then the accession numbers representing
amino acids should be interpreted as representing their corresponding nucleic acid.
Additional such sequences can be identified in GenBank by one of skill.

A test nucleic acid is said to specifically hybridize to a probe nucleic acid
when it hybridizes at least as well to the probe as to the perfectly matched complementary
target, i.e., with a signal to noise ratio at least as high as hybridization of the probe to the
target under conditions in which the perfectly matched probe binds to the perfectly matched
complementary target with a signal to noise ratio that is at least about 5x-10x as high as that
observed for hybridization to any of the unmatched target nucleic acids, e.g., represented by
or which encode the following GenBank accession numbers: 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108.

Ultra high-stringency hybridization and wash conditions are those in which
the stringency of hybridization and wash conditions is increased until the signal to noise
ratio for binding of the probe to the perfectly matched complementary target nucleic acid is
at least 10x as high as that observed for hybridization to any of the unmatched target nucleic
acids, e.g., represented by or which encode the following GenBank accession numbers:
1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108. A target nucleic acid which hybridizes to a probe under such
conditions, with a signal to noise ratio of at least ½ that of the perfectly matched
complementary target nucleic acid, is said to bind to the probe under ultra-high stringency
conditions.
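
The signal to noise criteria described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the thresholds of 2.5x and 10x are taken from the text, and the signal to noise values themselves would come from whatever label and detection system is used in the actual assay. The fraction of the perfectly matched signal that a test nucleic acid must reach is passed in as a parameter rather than fixed.

```python
from typing import Iterable


def fold_over_unmatched(matched_snr: float, unmatched_snrs: Iterable[float]) -> float:
    """Ratio of the perfectly matched signal to noise value to the highest
    signal to noise value observed for any unmatched (control) target."""
    worst_case = max(unmatched_snrs)
    if worst_case <= 0:
        raise ValueError("unmatched signal to noise values must be positive")
    return matched_snr / worst_case


def stringency_tier(fold: float) -> str:
    """Map a fold ratio to a tier label (labels are illustrative assumptions)."""
    if fold >= 10.0:   # at least 10x: ultra-high stringency criterion
        return "ultra-high"
    if fold >= 2.5:    # at least 2.5x: highly stringent criterion
        return "highly stringent"
    return "below highly stringent"


def binds_under_conditions(test_snr: float, matched_snr: float, min_fraction: float) -> bool:
    """True if the test target reaches the required fraction of the
    perfectly matched signal under the chosen conditions."""
    return test_snr >= min_fraction * matched_snr
```

For instance, a matched signal to noise value of 50 against unmatched values of 2 and 5 gives a fold ratio of 10 and therefore meets the 10x criterion in this sketch.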

Similarly, even higher levels of stringency can be determined by gradually
increasing the hybridization and/or wash conditions of the relevant hybridization assay. For
example, such conditions include those in which the stringency of hybridization and wash
conditions is increased until the signal to noise ratio for binding of the probe to the perfectly
matched complementary target nucleic acid is at least 10x, 20x, 50x, 100x, or 500x or more as
high as that observed for hybridization to any of the unmatched target nucleic acids represented
by or which encode the following GenBank accession numbers: 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108. A target nucleic acid which hybridizes to a probe under such conditions, with a
signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic
acid, is said to bind to the probe under ultra-ultra-high stringency conditions.

Target nucleic acids which hybridize to the nucleic acids represented by SEQ
ID NO:1 to SEQ ID NO:54 under high, ultra-high and ultra-ultra high stringency conditions
are a feature of the invention. Examples of such nucleic acids include those with one or a
few silent or conservative nucleic acid substitutions as compared to a given nucleic acid
sequence.

Nucleic acids, such as man-made nucleic acids which do not hybridize to each
other under stringent conditions are still substantially identical if the polypeptides which they
encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created
using the maximum codon degeneracy permitted by the genetic code, or when antisera
generated against one or more of SEQ ID NO:55 to SEQ ID NO:108, which has been
subtracted using the polypeptides represented by or which encode the following lipase related
sequences in GenBank: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108. Further details on immunological
identification of polypeptides of the invention are found below.

In one aspect, the invention provides a nucleic acid which comprises a unique
subsequence in a nucleic acid selected from SEQ ID NO:1 to SEQ ID NO:54. The unique
subsequence is unique as compared to a nucleic acid corresponding to any of the sequences
represented by or which encode, e.g., GenBank accession numbers: 1I6WA, 1I6WB,
A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108, or related sequences present in GenBank as of the filing of this application.
Alignment can be performed using the BLAST algorithm set to default parameters. Any
unique subsequence is useful, e.g., as a probe to identify the nucleic acids of the invention.

Similarly, the invention includes a polypeptide which comprises a unique
subsequence in a polypeptide selected from: SEQ ID NO:55 to SEQ ID NO:108. Here, the
unique subsequence is unique as compared to a polypeptide corresponding to any of the
sequences represented by or which encode GenBank accession numbers: 1I6WA, 1I6WB,
A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108. Such unique subsequences can be determined by aligning any of SEQ ID NO:55 to
SEQ ID NO:108 against the complete set of polypeptides represented by or which encode
GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108 (the control polypeptides) (note
that where the sequence corresponds to a non-translated sequence such as a pseudo gene, the
corresponding polypeptide is generated simply by in silico translation of the nucleic acid
sequence into an amino acid sequence, where the reading frame is selected to correspond to
the reading frame of lipase nucleic acids).
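
The idea of a subsequence that is unique relative to a set of control sequences can be sketched as follows. This is a simplified illustration, not the alignment-based procedure described above: it assumes exact matching of fixed-length windows instead of BLAST alignment, and the function name, the window length `k`, and the toy sequences in the usage note are all hypothetical.

```python
from typing import List, Tuple


def unique_subsequences(query: str, controls: List[str], k: int) -> List[Tuple[int, str]]:
    """Return (position, subsequence) pairs for every length-k window of
    `query` that does not occur exactly in any control sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Collect every length-k window present in the control set.
    control_windows = {
        control[i:i + k]
        for control in controls
        for i in range(len(control) - k + 1)
    }
    # Keep the query windows that are absent from the control set.
    return [
        (i, query[i:i + k])
        for i in range(len(query) - k + 1)
        if query[i:i + k] not in control_windows
    ]
```

With the toy query `"MKLAGH"`, controls `["MKLA", "LAGT"]` and `k = 3`, the only window not found in the controls is `"AGH"` at position 3. A real comparison would tolerate mismatches and gaps, which is why an alignment tool is used in practice.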

The invention also provides for target nucleic acids which hybridize under
stringent conditions to a unique coding oligonucleotide which encodes a unique subsequence
in a polypeptide selected from: SEQ ID NO:55 to SEQ ID NO:108, wherein the unique
subsequence is unique as compared to a polypeptide corresponding to any of the control
polypeptides (i.e., the above listed GenBank accession numbers). Unique sequences are
determined as noted above.

In one example, the stringent conditions are selected such that a perfectly
complementary oligonucleotide to the coding oligonucleotide hybridizes to the coding
oligonucleotide with at least about a 5-10x higher signal to noise ratio than for hybridization
of the perfectly complementary oligonucleotide to a control nucleic acid corresponding to
any of the control polypeptides. Conditions can be selected such that higher ratios of signal
to noise are observed in the particular assay which is used, e.g., about 15x, 20x, 30x, 50x or
more. In this example, the target nucleic acid hybridizes to the unique coding
oligonucleotide with at least a 2x higher signal to noise ratio (i.e., stringent conditions) as
compared to hybridization of the control nucleic acid to the coding oligonucleotide. Again,
higher signal to noise ratios can be selected, e.g., about 5x, 10x, 20x, 30x, 50x or more. The
particular signal will depend on the label used in the relevant assay, e.g., a fluorescent label,
a colorimetric label, a radioactive label, or the like.

Percent Sequence Identity - Sequence Similarity

As noted above, the peptides employed in the subject invention need not be
identical, but can be substantially identical, to the corresponding sequence of a lipase
molecule or related molecule. The peptides can be subject to various changes, such as
insertions, deletions, and substitutions, either conservative or non-conservative, where such
changes might provide for certain advantages in their use. The polypeptides of the invention
can be modified in a number of ways so long as they comprise a sequence substantially
identical (as defined below) to a sequence in a lipase molecule.

Alignment and comparison of relatively short amino acid sequences (less than
about 30 residues) is typically straightforward. Comparison of longer sequences can require
more sophisticated methods to achieve optimal alignment of two sequences. Optimal
alignment of sequences for aligning a comparison window can be conducted by the local
homology algorithm of Smith and Waterman (1981) Adv Appl Math 2:482, by the homology
alignment algorithm of Needleman and Wunsch (1970) J Mol Biol 48:443, by the search for
similarity method of Pearson and Lipman (1988) Proc Natl Acad Sci USA 85:2444, by
computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA
in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575
Science Dr., Madison, WI), or by inspection, with the best alignment (i.e., resulting in the
highest percentage of sequence similarity over the comparison window) generated by the
various methods being selected.
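
As an illustration of the kind of dynamic programming that underlies global alignment methods such as that of Needleman and Wunsch, the following sketch computes an optimal global alignment score for two sequences. It is a minimal example under stated assumptions: the match, mismatch, and linear gap scores are arbitrary illustrative values, not parameters of GAP or any other program named above, and only the score (not the alignment itself) is returned.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score with a linear gap penalty,
    computed row by row so that only two rows are held in memory."""
    # Row 0: aligning an empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against empty prefix of `b`
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap introduced in `b`
            left = curr[j - 1] + gap  # gap introduced in `a`
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

Under these illustrative scores, two identical seven-residue sequences score 7, and removing the last residue from one of them gives 4 (six matches and one gap).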



The term "sequence identity" means that two polynucleotide sequences are
identical (i.e., on a nucleotide-by-nucleotide basis) over a window of comparison. The term
"percentage of sequence identity" is calculated by comparing two optimally aligned
sequences over the window of comparison, determining the number of positions at which the
identical residues occur in both sequences to yield the number of matched positions, dividing
the number of matched positions by the total number of positions in the window of
comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage
of sequence identity.
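
The calculation just described can be written out as a short sketch. It assumes the two sequences have already been optimally aligned by some other method and are supplied as equal-length strings covering the comparison window, with `-` marking gap positions; the function name and the gap convention are assumptions made for this illustration.

```python
def percent_sequence_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by window size, multiplied by 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must span the same window")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window is empty")
    # A position matches when both sequences carry the same residue;
    # a gap character never counts as a match.
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matched / window_size
```

For the aligned pair `"ACGTACGT"` and `"ACGTTCGA"`, six of the eight positions match, giving 75 percent sequence identity.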

As applied to polypeptides, the term "substantial identity" means that two
peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using
default gap weights (described in more detail below), share at least about 70 percent
sequence identity, or at least about 75 percent sequence identity, frequently at least about 80
percent sequence identity, often at least about 85 percent sequence identity, preferably at
least about 90 percent sequence identity, more preferably at least about 95, 96, 97, 98 percent
sequence identity or more (e.g., 99 percent or more sequence identity) over a designated
comparison window, e.g., of at least 45 contiguous amino acids up to the entire length of the
polypeptide sequence. Alternatively, parameters are set such that one or more sequences of
the invention, e.g., SEQ ID NO:55 to SEQ ID NO:108, are identified by alignment to a query
sequence selected from among SEQ ID NO:55 to SEQ ID NO:108, while sequences
corresponding to unrelated polypeptides, e.g., corresponding to GenBank accession numbers:
1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108, are not identified.

Preferably, residue positions which are not identical differ by conservative
amino acid substitutions. Conservative amino acid substitution refers to the
interchangeability of residues having similar side chains. For example, a group of amino
acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group
of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of
amino acids having amide-containing side chains is asparagine and glutamine; a group of
amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group
of amino acids having basic side chains is lysine, arginine, and histidine; and a group of
amino acids having sulfur-containing side chains is cysteine and methionine. Preferred
conservative amino acid substitution groups are: valine-leucine-isoleucine,
phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
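
The side chain groups listed above lend themselves to a simple lookup. The sketch below encodes those groups with one-letter amino acid codes and checks whether a substitution stays within a group; the dictionary and function names are hypothetical, and the check reflects only the groupings given in this paragraph, not a substitution matrix.

```python
# Side chain groups as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),           # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),     # Ser, Thr
    "amide-containing": set("NQ"),       # Asn, Gln
    "aromatic": set("FYW"),              # Phe, Tyr, Trp
    "basic": set("KRH"),                 # Lys, Arg, His
    "sulfur-containing": set("CM"),      # Cys, Met
}


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues belong to the same side chain group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in SIDE_CHAIN_GROUPS.values())
```

Under this lookup, leucine to isoleucine and lysine to arginine count as conservative, while aspartate to lysine does not.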

A preferred example of an algorithm that is suitable for determining percent
sequence identity and sequence similarity is the FASTA algorithm, which is described in
Pearson, W.R. & Lipman, D.J., (1988) Proc Natl Acad Sci USA 85:2444. See also, W. R.
Pearson, (1996) Methods Enzymology 266:227-258. Preferred parameters used in a FASTA
alignment of DNA sequences to calculate percent identity are optimized, BL50 Matrix 15: -5,
k-tuple = 2; joining penalty = 40, optimization = 28; gap penalty = -12, gap length penalty = -2;
and width = 16.

Other preferred examples of algorithms that are suitable for determining
percent sequence identity and sequence similarity are the BLAST and BLAST 2.0
algorithms, which are described in Altschul et al., (1997) Nuc Acids Res 25:3389-3402 and
Altschul et al., (1990) J Mol Biol 215:403-410, respectively. BLAST and BLAST 2.0 are
used, with the parameters described herein, to determine percent sequence identity for the
nucleic acids and proteins of the invention. Software for performing BLAST analyses is
publicly available through the National Center for Biotechnology Information
(www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence
pairs (HSPs) by identifying short words of length W in the query sequence, which either
match or satisfy some positive-valued threshold score T when aligned with a word of the
same length in a database sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are extended in both
directions along each sequence for as far as the cumulative alignment score can be increased.
Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward
score for a pair of matching residues; always > 0) and N (penalty score for mismatching
residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the
cumulative score. Extension of the word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum achieved value; the cumulative
score goes to zero or below, due to the accumulation of one or more negative-scoring residue
alignments; or the end of either sequence is reached. The BLAST algorithm parameters W,
T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for
nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10,
M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP
program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62
scoring matrix (see, Henikoff & Henikoff, (1992) Proc Natl Acad Sci USA 89:10915), with
alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
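
The three halting rules for word hit extension described above can be illustrated with a short sketch. It is not the BLAST implementation: it extends an ungapped nucleotide seed in one direction only, assumes the seed coordinates are already known, and uses hypothetical names throughout. The reward M=5 and penalty N=-4 are the values given in the text; the default drop-off X of 20 is an arbitrary illustrative value.

```python
from typing import Tuple


def extend_seed_right(query: str, subject: str, q_start: int, s_start: int,
                      word_len: int, reward: int = 5, penalty: int = -4,
                      x_drop: int = 20) -> Tuple[int, int]:
    """Extend an ungapped seed to the right.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed at the point of maximum score.
    """
    def pair_score(i: int) -> int:
        return reward if query[q_start + i] == subject[s_start + i] else penalty

    # Score the seed word itself.
    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_length = score, word_len

    i = word_len
    # Rule 3: stop when the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += pair_score(i)
        if score > best_score:
            best_score, best_length = score, i + 1
        elif best_score - score >= x_drop or score <= 0:
            # Rule 1: score fell off by X from its maximum.
            # Rule 2: cumulative score went to zero or below.
            break
        i += 1
    return best_score, best_length
```

For a query `"ACGTACGTTTTT"` and subject `"ACGTACGTAAAA"` with a four-base seed at position 0 of each, the extension reaches a maximum score of 40 over the first eight positions and then declines through the mismatching tail. A full implementation would also extend to the left and combine the two results.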

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, (1993) Proc Natl Acad Sci USA
90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001.

Another example of a useful algorithm is PILEUP. PILEUP creates a
multiple sequence alignment from a group of related sequences using progressive, pairwise
alignments to show relationship and percent sequence identity. It also plots a tree or
dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a
simplification of the progressive alignment method of Feng & Doolittle, (1987) J Mol Evol
25:351-360. The method used is similar to the method described by Higgins & Sharp, (1989)
CABIOS 5:151-153. The program can align up to 300 sequences, each of a maximum length
of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the
pairwise alignment of the two most similar sequences, producing a cluster of two aligned
sequences. This cluster is then aligned to the next most related sequence or cluster of aligned
sequences. Two clusters of sequences are aligned by a simple extension of the pairwise
alignment of two individual sequences. The final alignment is achieved by a series of
progressive, pairwise alignments. The program is run by designating specific sequences and
their amino acid or nucleotide coordinates for regions of sequence comparison and by
designating the program parameters. Using PILEUP, a reference sequence is compared to
other test sequences to determine the percent sequence identity relationship using the
following parameters: default gap weight (3.00), default gap length weight (0.10), and
weighted end gaps. PILEUP can be obtained from the GCG sequence analysis software
package, e.g., version 7.0 (Devereux et al., (1984) Nuc Acids Res 12:387-395).

Another preferred example of an algorithm that is suitable for multiple DNA
and amino acid sequence alignments is the CLUSTALW program (Thompson, J. D. et al.,
(1994) Nuc Acids Res 22:4673-4680). CLUSTALW performs multiple pairwise
comparisons between groups of sequences and assembles them into a multiple alignment
based on homology. Gap open and Gap extension penalties were 10 and 0.05 respectively.
For amino acid alignments, the BLOSUM algorithm can be used as a protein weight matrix
(Henikoff and Henikoff, (1992) Proc Natl Acad Sci USA 89:10915-10919).

It will be understood by one of ordinary skill in the art, that the above 
discussion of search and alignment algorithms also applies to identification and evaluation of 
polynucleotide sequences, with the substitution of query sequences comprising nucleotide 
sequences, and where appropriate, selection of nucleic acid databases. 

SUBSTRATES AND FORMATS FOR SEQUENCE RECOMBINATION

A variety of diversity generating protocols are available and described in the
art. The procedures can be used separately, and/or in combination to produce one or more
variants of a nucleic acid or set of nucleic acids, as well as variants of encoded proteins.
Individually and collectively, these procedures provide robust, widely applicable ways of
generating diversified nucleic acids and sets of nucleic acids (including, e.g., nucleic acid
libraries) useful, e.g., for the engineering or rapid evolution of nucleic acids, proteins,
pathways, cells and/or organisms with new and/or improved characteristics.

While distinctions and classifications are made in the course of the ensuing
discussion for clarity, it will be appreciated that the techniques are often not mutually
exclusive. Indeed, the various methods can be used singly or in combination, in parallel or in
series, to access diverse sequence variants.

The result of any of the diversity generating procedures described herein can
be the generation of one or more nucleic acids, which can be selected or screened for nucleic
acids with or which confer desirable properties, or that encode proteins with or which confer
desirable properties. Following diversification by one or more of the methods herein, or
otherwise available to one of skill, any nucleic acids that are produced can be selected for a
desired activity or property, e.g., lipase activity and/or enantioselective lipase activity or
lipase activity against particular substrates. This can include identifying any activity that can
be detected, for example, in an automated or automatable format, by any of the assays in the
art, e.g., by any lipase activity assay (see, infra, for examples of lipase activity assays). A
variety of related (or even unrelated) properties can be evaluated, in serial or in parallel, at
the discretion of the practitioner.

Descriptions of a variety of diversity generating procedures for generating
modified nucleic acid sequences encoding lipase homologues are found in the following
publications and the references cited therein: Soong, N. et al. (2000) "Molecular breeding of
viruses" Nat Genet 25(4):436-439; Stemmer, et al. (1999) "Molecular breeding of viruses for
targeting and other clinical properties" Tumor Targeting 4:1-4; Ness et al. (1999) "DNA
Shuffling of subgenomic sequences of subtilisin" Nature Biotechnology 17:893-896; Chang
et al. (1999) "Evolution of a cytokine using DNA family shuffling" Nature Biotechnology
17:793-797; Minshull and Stemmer (1999) "Protein evolution by molecular breeding"
Current Opinion in Chemical Biology 3:284-290; Christians et al. (1999) "Directed evolution
of thymidine kinase for AZT phosphorylation using DNA family shuffling" Nature
Biotechnology 17:259-264; Crameri et al. (1998) "DNA shuffling of a family of genes from
diverse species accelerates directed evolution" Nature 391:288-291; Crameri et al. (1997)
"Molecular evolution of an arsenate detoxification pathway by DNA shuffling," Nature
Biotechnology 15:436-438; Zhang et al. (1997) "Directed evolution of an effective
fucosidase from a galactosidase by DNA shuffling and screening" Proc Natl Acad Sci USA
94:4504-4509; Patten et al. (1997) "Applications of DNA Shuffling to Pharmaceuticals and
Vaccines" Current Opinion in Biotechnology 8:724-733; Crameri et al. (1996) "Construction
and evolution of antibody-phage libraries by DNA shuffling" Nature Medicine 2:100-103;
Crameri et al. (1996) "Improved green fluorescent protein by molecular evolution using
DNA shuffling" Nature Biotechnology 14:315-319; Gates et al. (1996) "Affinity selective
isolation of ligands from peptide libraries through display on a lac repressor 'headpiece
dimer'" Journal of Molecular Biology 255:373-386; Stemmer (1996) "Sexual PCR and
Assembly PCR" In: The Encyclopedia of Molecular Biology. VCH Publishers, New York.
pp.447-457; Crameri and Stemmer (1995) "Combinatorial multiple cassette mutagenesis
creates all the permutations of mutant and wildtype cassettes" BioTechniques 18:194-195;
Stemmer et al., (1995) "Single-step assembly of a gene and entire plasmid from large
numbers of oligodeoxyribonucleotides" Gene 164:49-53; Stemmer (1995) "The Evolution of
Molecular Computation" Science 270:1510; Stemmer (1995) "Searching Sequence Space"
Bio/Technology 13:549-553; Stemmer (1994) "Rapid evolution of a protein in vitro by DNA
shuffling" Nature 370:389-391; and Stemmer (1994) "DNA shuffling by random
fragmentation and reassembly: In vitro recombination for molecular evolution." Proc Natl
Acad Sci USA 91:10747-10751.

Mutational methods of generating diversity include, for example, site-directed
mutagenesis (Ling et al. (1997) "Approaches to DNA mutagenesis: an overview" Anal
Biochem 254(2):157-178; Dale et al. (1996) "Oligonucleotide-directed random mutagenesis
using the phosphorothioate method" Methods Mol Biol 57:369-374; Smith (1985) "In vitro
mutagenesis" Ann Rev Genet 19:423-462; Botstein & Shortle (1985) "Strategies and
applications of in vitro mutagenesis" Science 229:1193-1201; Carter (1986) "Site-directed
mutagenesis" Biochem J 237:1-7; and Kunkel (1987) "The efficiency of oligonucleotide
directed mutagenesis" in Nucleic Acids & Molecular Biology (Eckstein, F. and Lilley,
D.M.J. eds., Springer Verlag, Berlin)); mutagenesis using uracil containing templates
(Kunkel (1985) "Rapid and efficient site-specific mutagenesis without phenotypic selection"
Proc Natl Acad Sci USA 82:488-492; Kunkel et al. (1987) "Rapid and efficient site-specific
mutagenesis without phenotypic selection" Methods in Enzymol 154, 367-382; and Bass et
al. (1988) "Mutant Trp repressors with new DNA-binding specificities" Science 242:240-245);
oligonucleotide-directed mutagenesis (Methods in Enzymol 100: 468-500 (1983);
Methods in Enzymol 154: 329-350 (1987); Zoller & Smith (1982) "Oligonucleotide-directed
mutagenesis using M13-derived vectors: an efficient and general procedure for the
production of point mutations in any DNA fragment" Nucleic Acids Res 10:6487-6500;
Zoller & Smith (1983) "Oligonucleotide-directed mutagenesis of DNA fragments cloned into
M13 vectors" Methods in Enzymol 100:468-500; and Zoller & Smith (1987)
"Oligonucleotide-directed mutagenesis: a simple method using two oligonucleotide primers
and a single-stranded DNA template" Methods in Enzymol 154:329-350);
phosphorothioate-modified DNA mutagenesis (Taylor et al. (1985) "The use of phosphorothioate-modified
DNA in restriction enzyme reactions to prepare nicked DNA" Nucl. Acids Res. 13: 8749-8764;
Taylor et al. (1985) "The rapid generation of oligonucleotide-directed mutations at
high frequency using phosphorothioate-modified DNA" Nucl. Acids Res. 13: 8765-8787
(1985); Nakamaye & Eckstein (1986) "Inhibition of restriction endonuclease Nci I cleavage
by phosphorothioate groups and its application to oligonucleotide-directed mutagenesis"
Nucl Acids Res 14: 9679-9698; Sayers et al. (1988) "5'-3' Exonucleases in
phosphorothioate-based oligonucleotide-directed mutagenesis" Nucl Acids Res 16:791-802; and Sayers et al.
(1988) "Strand specific cleavage of phosphorothioate-containing DNA by reaction with
restriction endonucleases in the presence of ethidium bromide" Nucl Acids Res 16: 803-814);
mutagenesis using gapped duplex DNA (Kramer et al. (1984) "The gapped duplex DNA
approach to oligonucleotide-directed mutation construction" Nucl Acids Res 12: 9441-9456;
Kramer & Fritz (1987) Methods in Enzymol "Oligonucleotide-directed construction of
mutations via gapped duplex DNA" 154:350-367; Kramer et al. (1988) "Improved enzymatic
in vitro reactions in the gapped duplex DNA approach to oligonucleotide-directed
construction of mutations" Nucl Acids Res 16: 7207; and Fritz et al. (1988)
"Oligonucleotide-directed construction of mutations: a gapped duplex DNA procedure
without enzymatic reactions in vitro" Nucl Acids Res 16: 6987-6999).

Additional suitable methods include point mismatch repair (Kramer et al.
(1984) "Point Mismatch Repair" Cell 38:879-887), mutagenesis using repair-deficient host
strains (Carter et al. (1985) "Improved oligonucleotide site-directed mutagenesis using M13
vectors" Nucl Acids Res 13: 4431-4443; and Carter (1987) "Improved oligonucleotide-directed
mutagenesis using M13 vectors" Methods in Enzymol 154: 382-403), deletion
mutagenesis (Eghtedarzadeh & Henikoff (1986) "Use of oligonucleotides to generate large
deletions" Nucl Acids Res 14: 5115), restriction-selection and restriction-purification (Wells
et al. (1986) "Importance of hydrogen-bond formation in stabilizing the transition state of
subtilisin" Phil. Trans. R. Soc. Lond. A 317: 415-423), mutagenesis by total gene synthesis
(Nambiar et al. (1984) "Total synthesis and cloning of a gene coding for the ribonuclease S
protein" Science 223: 1299-1301; Sakamar and Khorana (1988) "Total synthesis and
expression of a gene for the α-subunit of bovine rod outer segment guanine nucleotide-binding
protein (transducin)" Nucl Acids Res 14: 6361-6372; Wells et al. (1985) "Cassette
mutagenesis: an efficient method for generation of multiple mutations at defined sites" Gene
34:315-323; and Grundström et al. (1985) "Oligonucleotide-directed mutagenesis by
microscale 'shot-gun' gene synthesis" Nucl Acids Res 13: 3305-3316), double-strand break
repair (Mandecki (1986) "Oligonucleotide-directed double-strand break repair in plasmids of
Escherichia coli: a method for site-specific mutagenesis" Proc Natl Acad Sci USA 83:7177-7181;
and Arnold (1993) "Protein engineering for unusual environments" Current Opinion in
Biotechnology 4:450-455). Additional details on many of the above methods can be found in
Methods in Enzymology Volume 154, which also describes useful controls for
trouble-shooting problems with various mutagenesis methods.

Additional details regarding various diversity generating methods can be
found in the following U.S. patents, PCT publications and applications, and EPO
publications: U.S. Pat. No. 5,605,793 to Stemmer (February 25, 1997), "Methods for In Vitro
Recombination;" U.S. Pat. No. 5,811,238 to Stemmer et al. (September 22, 1998) "Methods
for Generating Polynucleotides having Desired Characteristics by Iterative Selection and
Recombination;" U.S. Pat. No. 5,830,721 to Stemmer et al. (November 3, 1998), "DNA
Mutagenesis by Random Fragmentation and Reassembly;" U.S. Pat. No. 5,834,252 to
Stemmer, et al. (November 10, 1998) "End-Complementary Polymerase Reaction;" U.S. Pat.
No. 5,837,458 to Minshull, et al. (November 17, 1998), "Methods and Compositions for
Cellular and Metabolic Engineering;" WO 95/22625, Stemmer and Crameri, "Mutagenesis
by Random Fragmentation and Reassembly;" WO 96/33207 by Stemmer and Lipschutz "End
Complementary Polymerase Chain Reaction;" WO 97/20078 by Stemmer and Crameri
"Methods for Generating Polynucleotides having Desired Characteristics by Iterative
Selection and Recombination;" WO 97/35966 by Minshull and Stemmer, "Methods and
Compositions for Cellular and Metabolic Engineering;" WO 99/41402 by Punnonen et al.
"Targeting of Genetic Vaccine Vectors;" WO 99/41383 by Punnonen et al. "Antigen Library
Immunization;" WO 99/41369 by Punnonen et al. "Genetic Vaccine Vector Engineering;"
WO 99/41368 by Punnonen et al. "Optimization of Immunomodulatory Properties of Genetic
Vaccines;" EP 752008 by Stemmer and Crameri, "DNA Mutagenesis by Random
Fragmentation and Reassembly;" EP 0932670 by Stemmer "Evolving Cellular DNA Uptake
by Recursive Sequence Recombination;" WO 99/23107 by Stemmer et al., "Modification of
Virus Tropism and Host Range by Viral Genome Shuffling;" WO 99/21979 by Apt et al.,
"Human Papillomavirus Vectors;" WO 98/31837 by del Cardayre et al. "Evolution of Whole
Cells and Organisms by Recursive Sequence Recombination;" WO 98/27230 by Patten and
Stemmer, "Methods and Compositions for Polypeptide Engineering;" WO 98/27230 by
Stemmer et al., "Methods for Optimization of Gene Therapy by Recursive Sequence
Shuffling and Selection," WO 00/00632, "Methods for Generating Highly Diverse
Libraries," WO 00/09679, "Methods for Obtaining in Vitro Recombined Polynucleotide
Sequence Banks and Resulting Sequences," WO 98/42832 by Arnold et al., "Recombination
of Polynucleotide Sequences Using Random or Defined Primers," WO 99/29902 by Arnold
et al., "Method for Creating Polynucleotide and Polypeptide Sequences," WO 98/41653 by
Vind, "An in Vitro Method for Construction of a DNA Library," WO 98/41622 by Borchert
et al., "Method for Constructing a Library Using DNA Shuffling," and WO 98/42727 by Pati
and Zarling, "Sequence Alterations using Homologous Recombination;" WO 00/18906 by
Patten et al., "Shuffling of Codon-Altered Genes;" WO 00/04190 by del Cardayre et al.
"Evolution of Whole Cells and Organisms by Recursive Recombination;" WO 00/42561 by
Crameri et al., "Oligonucleotide Mediated Nucleic Acid Recombination;" WO 00/42559 by
Selifonov and Stemmer "Methods of Populating Data Structures for Use in Evolutionary
Simulations;" WO 00/42560 by Selifonov et al., "Methods for Making Character Strings,
Polynucleotides & Polypeptides Having Desired Characteristics;" WO 01/23401 by Welch et
al., "Use of Codon-Varied Oligonucleotide Synthesis for Synthetic Shuffling;" and
PCT/US01/06775 "Single-Stranded Nucleic Acid Template-Mediated Recombination and
Nucleic Acid Fragment Isolation" by Affholter.

In brief, several different general classes of sequence modification methods,
such as mutation, recombination, etc. are applicable to the production of lipase homologue
nucleic acids encoding polypeptides with desired properties, and are set forth, e.g., in the
references above. The following exemplify some of the different types of preferred formats
for diversity generation in the context of the present invention, including, e.g., certain
recombination based diversity generation formats.

Nucleic acids can be recombined in vitro by any of a variety of techniques
discussed in the references above, including e.g., DNAse digestion of nucleic acids to be
recombined followed by ligation and/or PCR reassembly of the nucleic acids. For example,
sexual PCR mutagenesis can be used in which random (or pseudo random, or even non-
random) fragmentation of the DNA molecule is followed by recombination, based on
sequence similarity, between DNA molecules with different but related DNA sequences, in
vitro, followed by fixation of the crossover by extension in a polymerase chain reaction.
This process and many process variants are described in several of the references above, e.g.,
in Stemmer (1994) Proc Natl Acad Sci USA 91: 10747-10751. Thus, one or more in vitro
recombination procedures can be employed to generate a diverse set of lipase nucleic acids
suitable for evaluation in any of a variety of assays designed to identify lipase nucleic acids
encoding lipase polypeptides with desired properties. See, e.g., the lipase activity assays
described infra.
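
The fragmentation-and-reassembly idea can be illustrated with a small simulation. The following is a minimal sketch, not a description of any laboratory protocol: it assumes the parental sequences are already aligned and of equal length, models random fragmentation as random breakpoints, and models template switching as a random choice of donor parent for each fragment. The function name and the example sequences are hypothetical.

```python
import random

def shuffle_parents(parents, n_crossovers, rng):
    """Build one chimera from equal-length, aligned parent sequences."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    # Random breakpoints stand in for random fragmentation of the DNA.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)  # each fragment is copied from one parent
        pieces.append(donor[start:end])
    return "".join(pieces)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical related sequences, used only for illustration.
parents = ["ATGGCTAGCAAAGGT", "ATGGCAAGTAAGGGA", "ATGTCTAGCAAGGGT"]
library = [shuffle_parents(parents, 3, rng) for _ in range(5)]
for chimera in library:
    print(chimera)
```

Every position of each chimera carries a base found at that position in at least one parent, which is the property that distinguishes recombination from point mutagenesis in this toy model.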

Similarly, nucleic acids can be recursively recombined in vivo, e.g., by
allowing recombination to occur between nucleic acids in cells. Many such in vivo
recombination formats are set forth in the references noted above. Such formats optionally
provide direct recombination between nucleic acids of interest, or provide recombination
between vectors, viruses, plasmids, etc., comprising the nucleic acids of interest, as well as
other formats. Details regarding such procedures are found in the references noted above.
Thus, lipase nucleic acids can also be diversified in vivo prior to, or in concert with,
screening and/or selection procedures to identify lipase polypeptides with desired properties.

Whole genome recombination methods can also be used in which whole
genomes of cells or other organisms are recombined, optionally including spiking of the
genomic recombination mixtures with desired library components (e.g., genes corresponding
to the pathways of the present invention). These methods have many applications, including
those in which the identity of a target gene is not known. Details on such methods are found,
e.g., in WO 98/31837 by del Cardayre et al. "Evolution of Whole Cells and Organisms by
Recursive Sequence Recombination;" and in, e.g., WO 00/04190 by del Cardayre et al., also
entitled "Evolution of Whole Cells and Organisms by Recursive Sequence Recombination."

Synthetic recombination methods can also be used, in which oligonucleotides
corresponding to targets of interest are synthesized and reassembled in PCR or ligation
reactions which include oligonucleotides which correspond to more than one parental nucleic
acid, thereby generating new recombined nucleic acids. Oligonucleotides can be made by
standard nucleotide addition methods, or can be made, e.g., by tri-nucleotide synthetic
approaches. Details regarding such approaches are found in the references noted above,
including, e.g., WO 00/42561 by Crameri et al., "Oligonucleotide Mediated Nucleic Acid
Recombination;" WO 01/23401 by Welch et al., "Use of Codon-Varied Oligonucleotide
Synthesis for Synthetic Shuffling;" WO 00/42560 by Selifonov et al., "Methods for Making
Character Strings, Polynucleotides and Polypeptides Having Desired Characteristics;" and
WO 00/42559 by Selifonov and Stemmer "Methods of Populating Data Structures for Use in
Evolutionary Simulations."

In silico methods of recombination can be effected in which genetic
algorithms are used in a computer to recombine sequence strings which correspond to
homologous (or even non-homologous) nucleic acids. The resulting recombined sequence
strings are optionally converted into nucleic acids by synthesis of nucleic acids which
correspond to the recombined sequences, e.g., in concert with oligonucleotide synthesis/gene
reassembly techniques. This approach can generate random, partially random, or designed
variants. Many details regarding in silico recombination, including the use of genetic
algorithms, genetic operators and the like in computer systems, combined with generation of
corresponding nucleic acids (and/or proteins), as well as combinations of designed nucleic
acids and/or proteins (e.g., based on cross-over site selection) as well as designed, pseudo-
random, or random recombination methods are described in WO 00/42560 by Selifonov et
al., "Methods for Making Character Strings, Polynucleotides and Polypeptides Having
Desired Characteristics" and WO 00/42559 by Selifonov and Stemmer "Methods of
Populating Data Structures for Use in Evolutionary Simulations." Extensive details
regarding in silico recombination methods are found in these applications. This methodology
is generally applicable to the present invention in providing for recombination of character
strings corresponding to lipase homologues in silico and/or the generation of corresponding
nucleic acids or proteins.
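
As an illustration of recombining character strings with designed cross-over sites, the sketch below enumerates single-crossover recombinants of two aligned strings. It is only one possible genetic operator and is not taken from the cited applications: the rule that a cross-over is allowed where the parents are identical over a short window, the window size, and the example strings are all assumptions made for the example.

```python
def crossover_sites(a, b, window=3):
    """Return positions where both aligned strings agree over `window` characters."""
    if len(a) != len(b):
        raise ValueError("strings must be aligned to the same length")
    return [i for i in range(1, len(a) - window + 1)
            if a[i:i + window] == b[i:i + window]]

def single_crossovers(a, b, window=3):
    """Enumerate distinct single-crossover recombinants, excluding the parents."""
    recombinants = set()
    for i in crossover_sites(a, b, window):
        recombinants.add(a[:i] + b[i:])
        recombinants.add(b[:i] + a[i:])
    recombinants.discard(a)
    recombinants.discard(b)
    return sorted(recombinants)

# Hypothetical aligned character strings, used only for illustration.
parent_a = "MKLAVGSHT"
parent_b = "MRLAVGTHS"
print(crossover_sites(parent_a, parent_b))    # [2, 3]
print(single_crossovers(parent_a, parent_b))  # ['MKLAVGTHS', 'MRLAVGSHT']
```

The resulting strings could then be converted to nucleic acids by oligonucleotide synthesis and gene reassembly, as the text describes.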

Many methods of accessing natural diversity, e.g., by hybridization of diverse
nucleic acids or nucleic acid fragments to single-stranded templates, followed by
polymerization and/or ligation to regenerate full-length sequences, optionally followed by
degradation of the templates and recovery of the resulting modified nucleic acids, can be
similarly used. In one method employing a single-stranded template, the fragment
population derived from the genomic library(ies) is annealed with partial, or, often
approximately full length ssDNA or RNA corresponding to the opposite strand. Assembly of
complex chimeric genes from this population is then mediated by nuclease-based removal of
non-hybridizing fragment ends, polymerization to fill gaps between such fragments and
subsequent single stranded ligation. The parental polynucleotide strand can be removed by
digestion (e.g., if RNA or uracil-containing), magnetic separation under denaturing
conditions (if labeled in a manner conducive to such separation) and other available
separation/purification methods. Alternatively, the parental strand is optionally co-purified
with the chimeric strands and removed during subsequent screening and processing steps.
Additional details regarding this approach are found, e.g., in "Single-Stranded Nucleic Acid
Template-Mediated Recombination and Nucleic Acid Fragment Isolation" by Affholter,
PCT/US01/06775.

In another approach, single-stranded molecules are converted to double-
stranded DNA (dsDNA) and the dsDNA molecules are bound to a solid support by ligand-
mediated binding. After separation of unbound DNA, the selected DNA molecules are
released from the support and introduced into a suitable host cell to generate a library
enriched for sequences which hybridize to the probe. A library produced in this manner
provides a desirable substrate for further diversification using any of the procedures
described herein.

Any of the preceding general recombination formats can be practiced in a
reiterative fashion (e.g., one or more cycles of mutation/recombination or other diversity
generation methods, optionally followed by one or more selection methods) to generate a
more diverse set of recombinant nucleic acids.
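
The reiterative format, in which cycles of diversity generation alternate with selection, can be sketched as a simple loop. This is a toy model under stated assumptions: diversity is introduced by random base substitution only, and the "assay" is a hypothetical score counting matches to an arbitrary target string, standing in for whatever screen or selection is actually used. None of the names or parameter values come from the source.

```python
import random

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Redraw each base with probability `rate` (the redraw may repeat the base)."""
    return "".join(rng.choice(BASES) if rng.random() < rate else b for b in seq)

def score(seq, target):
    """Hypothetical assay: number of positions matching a target string."""
    return sum(x == y for x, y in zip(seq, target))

def evolve(parent, target, cycles, library_size, keep, rate, rng):
    """Alternate diversification and selection; return survivors and best scores."""
    pool, history = [parent], []
    for _ in range(cycles):
        # Diversify: the library holds the current pool plus fresh variants.
        library = pool + [mutate(rng.choice(pool), rate, rng)
                          for _ in range(library_size)]
        # Select: keep the top-scoring members for the next cycle.
        library.sort(key=lambda s: score(s, target), reverse=True)
        pool = library[:keep]
        history.append(score(pool[0], target))
    return pool, history

rng = random.Random(1)
pool, history = evolve("AAAAAAAAAAAA", "ACGTACGTACGT", cycles=6,
                       library_size=50, keep=5, rate=0.1, rng=rng)
print(history)
```

Because the surviving pool is carried into each new library, the best score recorded in `history` cannot decrease from one cycle to the next.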

Mutagenesis employing polynucleotide chain termination methods has also
been proposed (see, e.g., U.S. Patent No. 5,965,408, "Method of DNA reassembly by
interrupting synthesis" to Short, and the references above), and can be applied to the present
invention. In this approach, double stranded DNAs corresponding to one or more genes
sharing regions of sequence similarity are combined and denatured, in the presence or
absence of primers specific for the gene. The single stranded polynucleotides are then
annealed and incubated in the presence of a polymerase and a chain terminating reagent (e.g.,
ultraviolet, gamma or X-ray irradiation; ethidium bromide or other intercalators; DNA
binding proteins, such as single strand binding proteins, transcription activating factors, or
histones; polycyclic aromatic hydrocarbons; trivalent chromium or a trivalent chromium salt;
or abbreviated polymerization mediated by rapid thermocycling; and the like), resulting in
the production of partial duplex molecules. The partial duplex molecules, e.g., containing
partially extended chains, are then denatured and reannealed in subsequent rounds of
replication or partial replication resulting in polynucleotides which share varying degrees of
sequence similarity and which are diversified with respect to the starting population of DNA
molecules. Optionally, the products, or partial pools of the products, can be amplified at one
or more stages in the process. Polynucleotides produced by a chain termination method, such
as described above, are suitable substrates for any other described recombination format.

Diversity also can be generated in nucleic acids or populations of nucleic
acids using a recombinational procedure termed "incremental truncation for the creation of
hybrid enzymes" ("ITCHY") described in Ostermeier et al. (1999) "A combinatorial
approach to hybrid enzymes independent of DNA homology" Nature Biotech 17:1205. This
approach can be used to generate an initial library of variants which can optionally serve as a
substrate for one or more in vitro or in vivo recombination methods. See also, Ostermeier et
al. (1999) "Combinatorial Protein Engineering by Incremental Truncation," Proc Natl Acad
Sci USA, 96: 3562-67; Ostermeier et al. (1999), "Incremental Truncation as a Strategy in the
Engineering of Novel Biocatalysts," Biological and Medicinal Chemistry, 7: 2139-44.
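
To give a sense of the library such a homology-independent approach spans, the sketch below enumerates fusions of a 5' fragment of one gene to a 3' fragment of another at every pair of truncation points. It is a toy enumeration, not the published procedure: the function name, the tuple layout, and the rule used to flag in-frame fusions (the truncation points differ by a multiple of three) are assumptions for illustration.

```python
def truncation_hybrids(gene_a, gene_b, in_frame_only=True):
    """Fuse the first i bases of gene_a to gene_b with its first j bases removed."""
    hybrids = []
    for i in range(1, len(gene_a) + 1):   # length of the 5' fragment kept from A
        for j in range(len(gene_b)):      # number of 5' bases removed from B
            # B keeps its original codon phase only when i and j agree modulo 3.
            if in_frame_only and (i - j) % 3 != 0:
                continue
            hybrids.append((i, j, gene_a[:i] + gene_b[j:]))
    return hybrids

# Hypothetical short sequences, used only for illustration.
gene_a = "ATGAAA"
gene_b = "ATGCCC"
all_fusions = truncation_hybrids(gene_a, gene_b, in_frame_only=False)
in_frame = truncation_hybrids(gene_a, gene_b)
print(len(all_fusions), len(in_frame))  # 36 12
```

In this model roughly one third of all fusions preserve the reading frame of the second gene, which is one reason such libraries are typically screened or selected after construction.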

Mutational methods which result in the alteration of individual nucleotides or
groups of contiguous or non-contiguous nucleotides can be favorably employed to introduce
nucleotide diversity. For example, mutagenesis procedures resulting in changes of one or
more nucleotides can be used to produce any number of lipase variants of the present
invention. Many mutagenesis methods are found in the above-cited references; additional
details regarding mutagenesis methods can be found in the following, which can also be applied
to the present invention.

For example, error-prone PCR can be used to generate nucleic acid variants.
Using this technique, PCR is performed under conditions where the copying fidelity of the
DNA polymerase is low, such that a high rate of point mutations is obtained along the entire
length of the PCR product. Examples of such techniques are found in the references above
and, e.g., in Leung et al. (1989) Technique 1:11-15 and Caldwell et al. (1992) PCR Methods
Applic 2:28-33. Similarly, assembly PCR can be used, in a process which involves the
assembly of a PCR product from a mixture of small DNA fragments. A large number of
different PCR reactions can occur in parallel in the same reaction mixture, with the products
of one reaction priming the products of another reaction.
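
The effect of low-fidelity copying can be mimicked in a few lines. The sketch below is an illustrative model only: it treats every position as equally likely to be miscopied and every substitution as equally likely, which real polymerases do not do, and the error rate, template, and function name are hypothetical.

```python
import random

BASES = "ACGT"

def error_prone_copy(template, error_rate, rng):
    """Copy a template, substituting a different base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # A miscopy always yields one of the three other bases.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)

rng = random.Random(42)
template = "ATGGCTAGCAAAGGTGAAGAA"  # hypothetical template
variants = [error_prone_copy(template, 0.05, rng) for _ in range(4)]
for v in variants:
    mismatches = sum(x != y for x, y in zip(template, v))
    print(v, mismatches)
```

Under this model the expected number of substitutions per copy is simply the template length multiplied by the error rate.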

Oligonucleotide directed mutagenesis can be used to introduce site-specific
mutations in a nucleic acid sequence of interest. Examples of such techniques are found in
the references above and, e.g., in Reidhaar-Olson et al. (1988) Science, 241:53-57.
Similarly, cassette mutagenesis can be used in a process that replaces a small region of a
double stranded DNA molecule with a synthetic oligonucleotide cassette that differs from the
native sequence. The oligonucleotide can contain, e.g., completely and/or partially
randomized native sequence(s).
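
One common way to partially randomize a position in a cassette is a degenerate codon. The sketch below expands the degenerate codon NNK (N = any base, K = G or T) and tallies what it encodes under the standard genetic code. NNK is chosen here only as a familiar example and is not named in the text above; the helper names are hypothetical.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for this example.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}

# Standard genetic code, with codons ordered T, C, A, G at each position.
ORDER = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(ORDER)
               for j, b in enumerate(ORDER)
               for k, c in enumerate(ORDER)}

def expand(degenerate_codon):
    """List every concrete codon represented by a degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in degenerate_codon))]

codons = expand("NNK")
residues = [CODON_TABLE[c] for c in codons]
amino_acids = sorted(set(residues) - {"*"})
print(len(codons), len(amino_acids), residues.count("*"))  # 32 20 1
```

The tally shows why such a codon is attractive for randomization: 32 codons cover all 20 amino acids while admitting a single stop codon.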

Recursive ensemble mutagenesis is a process in which an algorithm for
protein mutagenesis is used to produce diverse populations of phenotypically related mutants,
members of which differ in amino acid sequence. This method uses a feedback mechanism
to monitor successive rounds of combinatorial cassette mutagenesis. Examples of this
approach are found in Arkin & Youvan (1992) Proc Natl Acad Sci USA 89:7811-7815.

Exponential ensemble mutagenesis can be used for generating combinatorial
libraries with a high percentage of unique and functional mutants. Small groups of residues
in a sequence of interest are randomized in parallel to identify, at each altered position,
amino acids which lead to functional proteins. Examples of such procedures are found in
Delagrave & Youvan (1993) Biotechnology Research 11:1548-1552.

In vivo mutagenesis can be used to generate random mutations in any cloned
DNA of interest by propagating the DNA, e.g., in a strain of E. coli that carries mutations in
one or more of the DNA repair pathways. These "mutator" strains have a higher random
mutation rate than that of a wild-type parent. Propagating the DNA in one of these strains
will eventually generate random mutations within the DNA. Such procedures are described
in the references noted above.

Other procedures for introducing diversity into a genome, e.g. a bacterial,
fungal, animal or plant genome can be used in conjunction with the above described and/or
referenced methods. For example, in addition to the methods above, techniques have been
proposed which produce nucleic acid multimers suitable for transformation into a variety of
species (see, e.g., Schellenberger U.S. Patent No. 5,756,316 and the references above).
Transformation of a suitable host with such multimers, consisting of genes that are divergent
with respect to one another, (e.g., derived from natural diversity or through application of site
directed mutagenesis, error prone PCR, passage through mutagenic bacterial strains, and the
like), provides a source of nucleic acid diversity for DNA diversification, e.g., by an in vivo
recombination process as indicated above.

Alternatively, a multiplicity of monomeric polynucleotides sharing regions of
partial sequence similarity can be transformed into a host species and recombined in vivo by
the host cell. Subsequent rounds of cell division can be used to generate libraries, members
of which include a single, homogenous population, or pool of monomeric polynucleotides.
Alternatively, the monomeric nucleic acid can be recovered by standard techniques, e.g.,
PCR and/or cloning, and recombined in any of the recombination formats, including
recursive recombination formats, described above.

Methods for generating multispecies expression libraries have been described
(in addition to the reference noted above, see, e.g., Peterson et al. (1998) U.S. Pat. No.
5,783,431 "Methods for Generating and Screening Novel Metabolic Pathways," and
Thompson, et al. (1998) U.S. Pat. No. 5,824,485 "Methods for Generating and Screening
Novel Metabolic Pathways") and their use to identify protein activities of interest has been
proposed (in addition to the references noted above, see, Short (1999) U.S. Pat. No.
5,958,672 "Protein Activity Screening of Clones Having DNA from Uncultivated
Microorganisms"). Multispecies expression libraries include, in general, libraries comprising
cDNA or genomic sequences from a plurality of species or strains, operably linked to
appropriate regulatory sequences, in an expression cassette. The cDNA and/or genomic
sequences are optionally randomly ligated to further enhance diversity. The vector can be a
shuttle vector suitable for transformation and expression in more than one species of host
organism, e.g., bacterial species, eukaryotic cells. In some cases, the library is biased by
preselecting sequences which encode a protein of interest, or which hybridize to a nucleic
acid of interest. Any such libraries can be provided as substrates for any of the methods
herein described.

The above described procedures have been largely directed to increasing
nucleic acid and/or encoded protein diversity. However, in many cases, not all of the
diversity is useful, e.g., functional, and contributes merely to increasing the background of
variants that must be screened or selected to identify the few favorable variants. In some
applications, it is desirable to preselect or prescreen libraries (e.g., an amplified library, a
genomic library, a cDNA library, a normalized library, etc.) or other substrate nucleic acids
prior to diversification, e.g., by recombination-based mutagenesis procedures, or to otherwise
bias the substrates towards nucleic acids that encode functional products. For example, in the
case of antibody engineering, it is possible to bias the diversity generating process toward
antibodies with functional antigen binding sites by taking advantage of in vivo recombination
events prior to manipulation by any of the described methods. For example, recombined
CDRs derived from B cell cDNA libraries can be amplified and assembled into framework
regions (e.g., Jirholt et al. (1998) "Exploiting sequence space: shuffling in vivo formed
complementarity determining regions into a master framework" Gene 215: 471) prior to
diversifying according to any of the methods described herein.

Libraries can be biased towards nucleic acids which encode proteins with
desirable enzyme activities. For example, after identifying a clone from a library which
exhibits a specified activity, the clone can be mutagenized using any known method for
introducing DNA alterations. A library comprising the mutagenized homologues is then
screened for a desired activity, which can be the same as or different from the initially
specified activity. An example of such a procedure is proposed in Short (1999) U.S. Patent
No. 5,939,250 for "Production of Enzymes Having Desired Activities by Mutagenesis."
Desired activities can be identified by any method known in the art. For example, WO
99/10539 proposes that gene libraries can be screened by combining extracts from the gene
library with components obtained from metabolically rich cells and identifying combinations
which exhibit the desired activity. It has also been proposed (e.g., WO 98/58085) that clones
with desired activities can be identified by inserting bioactive substrates into samples of the
library, and detecting bioactive fluorescence corresponding to the product of a desired
activity using a fluorescent analyzer, e.g., a flow cytometry device, a CCD, a fluorometer, or
a spectrophotometer.

Libraries can also be biased towards nucleic acids which have specified
characteristics, e.g., hybridization to a selected nucleic acid probe. For example, application
WO 99/10539 proposes that polynucleotides encoding a desired activity (e.g., an enzymatic
activity, for example: a lipase, an esterase, a protease, a glycosidase, a glycosyl transferase, a
phosphatase, a kinase, an oxygenase, a peroxidase, a hydrolase, a hydratase, a nitrilase, a
transaminase, an amidase or an acylase) can be identified from among genomic DNA
sequences in the following manner. Single stranded DNA molecules from a population of
genomic DNA are hybridized to a ligand-conjugated probe. The genomic DNA can be
derived from either a cultivated or uncultivated microorganism, or from an environmental
sample. Alternatively, the genomic DNA can be derived from a multi-cellular organism, or a
tissue derived therefrom. Second strand synthesis can be conducted directly from the
hybridization probe used in the capture, with or without prior release from the capture
medium or by a wide variety of other strategies known in the art. Alternatively, the isolated
single-stranded genomic DNA population can be fragmented without further cloning and
used directly in, e.g., a recombination-based approach, that employs a single-stranded
template, as described above.

"Non-Stochastic" methods of generating nucleic acids and polypeptides are
alleged in Short "Non-Stochastic Generation of Genetic Vaccines and Enzymes" WO
00/46344. These methods, including proposed non-stochastic polynucleotide reassembly
and site-saturation mutagenesis methods can be applied to the present invention as well.
Random or semi-random mutagenesis using doped or degenerate oligonucleotides is also
described in, e.g., Arkin and Youvan (1992) "Optimizing nucleotide mixtures to encode
specific subsets of amino acids for semi-random mutagenesis" Biotechnology 10:297-300;
Reidhaar-Olson et al. (1991) "Random mutagenesis of protein sequences using
oligonucleotide cassettes" Methods Enzymol 208:564-86; Lim and Sauer (1991) "The role
of internal packing interactions in determining the structure and stability of a protein" J Mol
Biol 219:359-76; Breyer and Sauer (1989) "Mutational analysis of the fine specificity of
binding of monoclonal antibody 51F to lambda repressor" J Biol Chem 264:13355-60; and
"Walk-Through Mutagenesis" (Crea, R; US Patents 5,830,650 and 5,798,208, and EP Patent
0527809 B1).

It will readily be appreciated that any of the above described techniques 
suitable for enriching a library prior to diversification can also be used to screen the products, 
or libraries of products, produced by the diversity generating methods. 

Kits for mutagenesis, library construction and other diversity generation
methods are also commercially available. For example, kits are available from, e.g.,
Stratagene (e.g., QuickChange™ site-directed mutagenesis kit; and Chameleon™ double-
stranded, site-directed mutagenesis kit), Bio/Can Scientific, Bio-Rad (e.g., using the Kunkel
method described above), Boehringer Mannheim Corp., Clonetech Laboratories, DNA
Technologies, Epicentre Technologies (e.g., 5 prime 3 prime kit); Genpak Inc, Lemargo Inc,
Life Technologies (Gibco BRL), New England Biolabs, Pharmacia Biotech, Promega Corp.,
Quantum Biotechnologies, Amersham International plc (e.g., using the Eckstein method
above), and Anglian Biotechnology Ltd. (e.g., using the Carter/Winter method above).

The above references provide many mutational formats, including
recombination, recursive recombination, recursive mutation and combinations of
recombination with other forms of mutagenesis, as well as many modifications of these
formats. Regardless of the diversity generation format that is used, the nucleic acids of the
invention can be recombined (with each other, or with related (or even unrelated) sequences)
to produce a diverse set of recombinant nucleic acids, including, e.g., sets of homologous
nucleic acids, as well as corresponding polypeptides.

The current invention provides methods of producing modified or
recombinant nucleic acids comprising mutating or recombining (including recursive
recombination with one or more additional nucleic acids) a nucleic acid of the invention (or a
fragment thereof), as well as the modified or recombinant nucleic acids that are produced by
such method. In the method, the one or more additional nucleic acids optionally
encode a polypeptide comprising lipase activity and/or enantioselective lipase activity
(or an amino acid subsequence or fragment thereof). The recombination (e.g., recursive
recombination) is optionally done in vitro or in vivo and optionally produces at least one
library of recombinant nucleic acids, which comprises at least one polypeptide comprising
lipase activity and/or enantioselective lipase activity (or a homologue thereof). Both the
nucleic acid library produced and a population of cells comprising the library are provided by
the invention, as are the modified or recombinant nucleic acids produced by the
mutation/recombination and the cells which comprise such nucleic acids. The invention also
includes a method of producing a polypeptide by introducing a nucleic acid of the invention
(or a fragment thereof), which is operably linked to a regulatory sequence capable of
directing expression of such nucleic acid into a polypeptide in at least a subset of a
population of cells or their progeny and then expressing the polypeptide in the subset of the
population (or their progeny). The polypeptide produced from such method is also part of
the current invention. Such method optionally includes isolating the polypeptide from the
cells and optionally includes expressing the polypeptide by culturing the population (or
subset) in a nutrient medium under conditions where the regulatory sequence directs
expression of the polypeptide encoded by the nucleic acid (again, wherein the polypeptide is
optionally isolated or recovered from the cells and/or from the nutrient media); such culturing
is optionally done in a bulk fermentation vessel. The cells used in such methods are
optionally bacterial or eukaryotic (e.g., fungal cells, yeast cells, plant cells, insect cells, or
mammalian cells (e.g., fertilized oocytes, embryonic stem cells, pluripotent stem cells, etc.)).
If mammalian cells are utilized, a transgenic animal is optionally regenerated from the cells
and the polypeptide is optionally recovered from the transgenic animal or from a by-product
of the transgenic animal such as milk.

HIGH THROUGHPUT SCREENING

High throughput screening formats are typically those formats which enable
the efficient evaluation of a large number of samples, such as are associated with a library of
nucleic acid or polypeptide sequences. Typically, a high throughput screening assay enables
the evaluation of greater than 100, more commonly greater than 500, often greater than 1000
or more samples in an efficient manner. A number of types of assays can be adapted to a
high throughput format. For example, the throughput associated with a nucleic acid hybridization
assay can be increased by adapting the assay from, e.g., electrophoretic separation of the
subject nucleic acids followed by transfer to a nylon or nitrocellulose membrane and
subsequent hybridization, to a "dot blot" format based on direct application of the subject
nucleic acids to a membrane in an array with subsequent hybridization to a probe. The
throughput can be further increased by robotic assistance, e.g., of the nucleic acid preparation
and/or membrane application steps of the procedure. Similarly, many cell based assays can
be reduced in scale, and increased in processing efficiency.

In addition to the nucleic acid screening methods indicated above, high
throughput assays are used in the context of the present invention to measure functional
activity of the nucleic acid and polypeptides described herein. One common format for cell
based screening assays in a high throughput format is the multiwell microtiter plate, although
other formats are also suitably adapted to the present invention (e.g., microfluidic devices
such as the HP/Agilent Technologies HP2100 and the Caliper HTS system: Caliper
Technologies, Mountain View, California).

Standard microtiter plates are available with 96, 384 or 1536 wells, although
even higher numbers of wells are also available. Well construction and materials can be
selected according to the precise application. For example, well dimensions vary in shape,
cross sectional area, depth and volume, the choice of which can be influenced by such
parameters as minimizing reagent use, or maximizing product recovery. Common materials
include a myriad of plastics, including polystyrene, polypropylene and the like. For some
cell culture applications, it is desirable to use microtiter plates that have been pre-treated with
agents that improve cell adherence or survival, e.g., poly-lysine, gelatin, etc.
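
As an illustration of how samples can be tracked across the plate formats mentioned above, the following Python sketch converts a zero-based sample index into a well name. The row-by-column layouts (8 x 12, 16 x 24, 32 x 48) are the conventional ones for 96, 384 and 1536 well plates; the row-major fill order and the function names are assumptions made for this example and are not part of the disclosure.

```python
import string

# Conventional (rows, columns) layouts for the well counts named in the text.
PLATE_LAYOUTS = {96: (8, 12), 384: (16, 24), 1536: (32, 48)}


def row_label(row: int) -> str:
    """Return a plate row label: 0 -> 'A', 25 -> 'Z', 26 -> 'AA', 31 -> 'AF'."""
    letters = string.ascii_uppercase
    if row < 26:
        return letters[row]
    # 1536-well plates have 32 rows, so rows 26-31 need two letters.
    return letters[row // 26 - 1] + letters[row % 26]


def well_name(index: int, wells: int = 96) -> str:
    """Map a zero-based sample index to a well name, filling row by row.

    The row-major fill order is an assumption of this sketch; a robotic
    handler may use a different order.
    """
    rows, cols = PLATE_LAYOUTS[wells]
    if not 0 <= index < rows * cols:
        raise ValueError(f"index {index} does not fit a {wells}-well plate")
    row, col = divmod(index, cols)
    return f"{row_label(row)}{col + 1}"


if __name__ == "__main__":
    print(well_name(0), well_name(95), well_name(383, wells=384))
```

The same index can therefore be carried from a 96 well screen to a denser format simply by changing the `wells` argument.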

Typically the plate dimensions are selected for compatibility with robotic
loading and handling devices. Suitable robotic plate handling devices include, e.g.,
Multimek from Beckman Coulter, Q-BOTDI from Genetix, and the BioRobot #9600/9604
from Qiagen.

OTHER POLYNUCLEOTIDE COMPOSITIONS 

The invention also includes compositions comprising two or more
polynucleotides of the invention (e.g., as substrates for recombination). The composition can
comprise a library of recombinant nucleic acids, where the library contains at least 2, 3, 5,
10, 20, 50, 100, 1,000 or 5,000 or more nucleic acids. The nucleic acids are optionally
cloned into expression vectors, providing expression libraries.

The invention also includes compositions produced by digesting one or more
polynucleotides of the invention with a restriction endonuclease, an RNAse, or a DNAse (e.g.,
as is performed in certain of the recombination formats noted above); and compositions
produced by fragmenting or shearing one or more polynucleotides of the invention by
mechanical means (e.g., sonication, vortexing, and the like), or by chemical cleavage (e.g.,
by incorporating nucleotide analogues subject to, e.g., photo-activated or other cleavage),
which can also be used to provide substrates for recombination in the methods above.
Similarly, compositions comprising sets of oligonucleotides corresponding to more than one
nucleic acid of the invention are useful as recombination substrates and are a feature of the
invention. For convenience, these fragmented, sheared, or synthesized oligonucleotide
mixtures are referred to as fragmented nucleic acid sets.

Also included in the invention are compositions produced by incubating one
or more of the fragmented nucleic acid sets in the presence of ribonucleotide- or
deoxyribonucleotide triphosphates and a nucleic acid polymerase. This resulting
composition forms a recombination mixture for many of the recombination formats noted
above. The nucleic acid polymerase may be an RNA polymerase, a DNA polymerase, or an
RNA-directed DNA polymerase (e.g., a "reverse transcriptase"); the polymerase can be, e.g.,
a thermostable DNA polymerase (such as VENT, TAQ, or the like).

LIPASE HOMOLOGUE POLYPEPTIDES 

The invention provides isolated or recombinant lipase homologue
polypeptides, referred to herein as "novel lipase polypeptides," "lipase homologue
polypeptides," "lipase homologues," or simply "novel lipases." For example, an isolated or
recombinant lipase homologue polypeptide of the invention includes a polypeptide
comprising a sequence selected from SEQ ID NO: 55 to SEQ ID NO: 108, and
conservatively modified variations thereof (as well as a fragment of such, which fragment
can comprise lipase activity and/or enantioselective lipase activity). Additionally, the
invention provides a polypeptide encoded by a polynucleotide sequence selected from SEQ
ID NO: 1 through SEQ ID NO: 54 or a complementary polynucleotide sequence thereof, etc.
Alignments of both nucleic acid and amino acid exemplary lipase homologue polypeptide
sequences (for both newly isolated homologues and for newly created homologues)
according to the invention are provided in Figures 3 through 6. Figure 3 depicts an
alignment of exemplary novel lipase polynucleotides of the invention (SEQ ID NOS:1-20).
The predicted boundary between the signal peptide coding region and the mature coding
region is indicated by the arrow. Thus, a mature coding region or mature polypeptide, either
as a polypeptide or as its encoding nucleic acid, of the invention comprises such an area as is
delineated in, e.g., Figure 3, i.e., it does not include signal peptide regions, introductory
5' regions or tailing 3' regions such as a TGA stop, etc. Figure 4 depicts an alignment of
exemplary novel lipase polynucleotides of the invention (SEQ ID NOS:21-54). The
nucleotide sequences depicted in the figure represent predicted mature coding regions, each
with an introductory 5' "T" just prior to the start of the mature coding region, and ending with
a 3' "TGA" stop codon. Figure 5 depicts an alignment of exemplary novel lipase
polypeptides of the invention (SEQ ID NOS:55-74). The predicted boundary between the
signal peptide and the mature region is indicated by the arrow. The position numbering
along the top of the alignments indicates the position relative to the start of the mature region.
Figure 6 depicts an alignment of exemplary novel lipase polypeptides of the invention (SEQ
ID NOS:75-108). The sequences shown represent the predicted mature region. The
alignments shown in Figures 3-6 were prepared using the CLUSTALW multiple sequence
alignment program, a part of the Vector NTI version 6 sequence analysis software package
(Informax, Bethesda, MD), using default parameters.

Another feature of the invention is an isolated or recombinant polypeptide
encoded by a polynucleotide sequence which hybridizes under highly stringent conditions
over substantially the entire length, or to a subsequence thereof comprising at least 100
residues or more, of a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO:
54 (or a complementary sequence thereof) or a polynucleotide sequence encoding a
polypeptide selected from SEQ ID NO: 55 to SEQ ID NO: 108 (or a complementary
sequence thereof) or a fragment thereof (from either SEQ ID NO: 1-54 or SEQ ID NO: 55-108,
which fragment can comprise lipase activity and/or enantioselective lipase activity)
provided that the sequences do not correspond to or encode any of GenBank accession
numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108.

Various aspects of the current invention comprise an isolated or recombinant
polypeptide comprising a sequence having at least 97% amino acid sequence identity to any
one of SEQ ID NO: 75 to SEQ ID NO: 108. Such polypeptide can optionally comprise or
exhibit lipase activity (e.g., it can degrade geranyl butyrate or neryl butyrate or both).
Additionally, such polypeptide can exhibit enantioselectivity for geranyl butyrate over neryl
butyrate. Such polypeptide that exhibits enantioselectivity for geranyl butyrate can comprise
a sequence selected from: SEQ ID NO:76, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:86,
SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:104, SEQ
ID NO:107, SEQ ID NO:108, SEQ ID NO:78, SEQ ID NO:87, SEQ ID NO:100, SEQ ID
NO:75, SEQ ID NO:77, SEQ ID NO:88, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:103,
or SEQ ID NO:106. Alternatively, the polypeptide can exhibit enantioselectivity for neryl
butyrate over geranyl butyrate. Such polypeptide that exhibits enantioselectivity for neryl
butyrate over geranyl butyrate can comprise a sequence selected from: SEQ ID NO:81, SEQ
ID NO:82, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:89, SEQ ID NO:90, SEQ ID
NO:94, SEQ ID NO:95, SEQ ID NO:105, SEQ ID NO:84, SEQ ID NO:91, SEQ ID NO:92,
or SEQ ID NO:93.

Furthermore, the polypeptide can comprise a polypeptide encoded by a
polynucleotide sequence which hybridizes under highly stringent conditions over
substantially the entire length of a polynucleotide sequence selected from SEQ ID NO: 1-54
(or a complementary sequence thereof), or by a polynucleotide sequence encoding a
polypeptide sequence selected from SEQ ID NO: 55-108 (or a complementary sequence
thereof), and wherein the polypeptide comprises one or more of: Lys at position 1; Thr at
position 14; Ser at position 17; Arg at position 22; Glu at position 26; Pro at position 31; Gly
at position 33; Glu at position 34; Pro at position 35; Pro or Thr at position 37; Ser or Lys at
position 41; Gly at position 42; Arg or Glu at position 43; Ala at position 61; Tyr at position
75; Gly at position 96; Ser at position 97; Thr at position 104; Ser at position 107; Ala at
position 125; Gly at position 129; Val at position 134; Cys at position 138; Lys at position
141; Lys at position 146; Thr at position 156; Met at position 160; Arg at position 166; or His
at position 177. Alternatively, the polypeptide can comprise one or more of: Lys at position
1; Thr at position 14; Ser at position 17; Arg at position 22; Glu at position 26; Pro at
position 31; Gly at position 33; Glu at position 34; Pro at position 35; Pro or Thr at position
37; Ser or Lys at position 41; Gly at position 42; Arg or Glu at position 43; Ala at position
61; Tyr at position 75; Gly at position 96; Ser at position 97; Thr at position 104; Ser at
position 107; Ala at position 125; Gly at position 129; Val at position 134; Cys at position
138; Lys at position 141; Lys at position 146; Thr at position 156; Met at position 160; Arg at
position 166; or His at position 177 (or an equivalent position to that of SEQ ID NO: 75).
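
By way of illustration only, the residue/position list above can be expressed as a lookup table and checked against a candidate mature sequence. The Python sketch below assumes the candidate is already numbered from the start of the mature region in register with SEQ ID NO: 75 (i.e., any alignment needed to find equivalent positions has been done beforehand). The function name and the toy sequence in the usage lines are hypothetical and do not correspond to any sequence of the invention.

```python
from typing import Dict, List

# Position (1-based, relative to the start of the mature region) mapped to the
# one-letter codes of the residues recited in the text for that position.
FEATURE_RESIDUES: Dict[int, str] = {
    1: "K", 14: "T", 17: "S", 22: "R", 26: "E", 31: "P", 33: "G", 34: "E",
    35: "P", 37: "PT", 41: "SK", 42: "G", 43: "RE", 61: "A", 75: "Y",
    96: "G", 97: "S", 104: "T", 107: "S", 125: "A", 129: "G", 134: "V",
    138: "C", 141: "K", 146: "K", 156: "T", 160: "M", 166: "R", 177: "H",
}


def matching_positions(mature_seq: str) -> List[int]:
    """Return the recited positions at which mature_seq carries a listed residue.

    Assumes mature_seq is numbered in register with SEQ ID NO: 75; this
    sketch performs no alignment of its own.
    """
    seq = mature_seq.upper()
    hits = []
    for pos, allowed in FEATURE_RESIDUES.items():
        if pos <= len(seq) and seq[pos - 1] in allowed:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    toy = "K" + "A" * 179          # hypothetical 180-residue sequence
    print(matching_positions(toy))  # positions where a recited residue occurs
```

A sequence satisfies the "one or more of" language whenever the returned list is non-empty.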

Such polypeptide can comprise or exhibit lipase activity or the ability to
degrade geranyl butyrate, neryl butyrate, or both neryl and geranyl butyrate. The polypeptide
can also exhibit enantioselectivity for geranyl butyrate over neryl butyrate. A polypeptide
exhibiting enantioselectivity for geranyl butyrate over neryl butyrate can comprise one or
more of: Arg at position 22; Gly at position 33; Ser or Lys at position 41; Arg at position 43;
Ser at position 107; Lys at position 141; Lys at position 146; Met at position 160; or His at
position 177, or can comprise one or more of: Arg at position 43; or Ser at position 107.

Such polypeptide can alternatively comprise or exhibit enantioselectivity for
neryl butyrate over geranyl butyrate. Such polypeptide can comprise one or more of: Ser at
position 17; Arg at position 22; Pro at position 31; Gly at position 33; Ser or Lys at position
41; Lys at position 141; Lys at position 146; Met at position 160; Arg at position 166; or His
at position 177, or can comprise one or more of: Ser at position 17; Pro at position 31; or
Arg at position 166.

In another aspect, the invention can comprise an isolated or recombinant
polypeptide comprising a sequence having at least 94% amino acid sequence identity to the
mature region of SEQ ID NO: 55, 61, 64, 65, 67, 68, 70, or 72. Alternatively, such
polypeptide can comprise a sequence having at least 94% amino acid sequence identity to the
mature region of SEQ ID NO: 55, which polypeptide also can comprise a sequence selected
from SEQ ID NO: 55, 58-62, 75-78, 80-88, or 94-108 (or the mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 61, which polypeptide also can
comprise a sequence selected from SEQ ID NO: 55, 57-62, 75-78, 80-90, or 93-108.
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 64, which polypeptide also can
comprise a sequence selected from SEQ ID NO: 64, 71, or 72 (or the mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 65, which polypeptide can also
comprise a sequence selected from SEQ ID NO: 65, 66, or 73 (or a mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 67, which polypeptide can also
comprise the sequence SEQ ID NO: 67 (or the mature region thereof). Alternatively, the
polypeptide can comprise a sequence having at least 94% amino acid sequence identity to the
mature region of SEQ ID NO: 68, which polypeptide can also comprise a sequence selected
from SEQ ID NO: 68 or 101 (or the mature region thereof). Alternatively, the polypeptide
can comprise a sequence having at least 94% amino acid sequence identity to the mature
region of SEQ ID NO: 70, which polypeptide can also comprise a sequence selected from
SEQ ID NO: 63, 68-70, 82-83, 85-86, 96, or 101-102 (or the mature region thereof).
Alternatively, the polypeptide can comprise a sequence having at least 94% amino acid
sequence identity to the mature region of SEQ ID NO: 72, which polypeptide can also
comprise a sequence selected from SEQ ID NO: 64, 71, or 72 (or a mature region thereof).

In another aspect, the invention can comprise an isolated or recombinant
polypeptide comprising a sequence having at least 85% amino acid sequence identity to the
mature region of SEQ ID NO: 74, which polypeptide can also comprise a sequence selected
from SEQ ID NO: 63, 71-72, 74, or 79 (or a mature region thereof).

In yet another aspect, the invention can comprise an isolated or recombinant
polypeptide comprising a sequence having at least 99% amino acid sequence identity to the
mature region of SEQ ID NO: 56.

The extent of the region of identity or similarity can extend from a
comparison window of at least 45 amino acids to the entire length of the lipase homologue
polypeptide. In an embodiment, such polypeptides are identified by performing a sequence
alignment with any one or more of SEQ ID NO: 55 to SEQ ID NO: 108 using BLASTP with
default parameters set to the desired percentage identity. Alternatively, the default
parameters can be set to identify polypeptide sequences with greater identity to one or more
of SEQ ID NO: 55 to SEQ ID NO: 108.
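
To make the notions of percentage identity and comparison window concrete, the following Python sketch scores two sequences that have already been pairwise aligned (equal length, "-" for gaps). It is a simplified stand-in and is not BLASTP: the function names are hypothetical, gap columns are simply counted as mismatches, and other tools may use a different denominator when reporting identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Inputs must be pre-aligned to equal length; a column containing a gap
    ('-') is counted as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def best_window_identity(aligned_a: str, aligned_b: str, window: int = 45) -> float:
    """Highest percent identity found in any run of `window` aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be of equal aligned length")
    if window < 1 or len(aligned_a) < window:
        raise ValueError("alignment is shorter than the comparison window")
    return max(
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    )


if __name__ == "__main__":
    # Toy four-column alignments, not sequences of the invention.
    print(percent_identity("ACDE", "ACDF"))
    print(best_window_identity("AAAAC", "AAAAD", window=4))
```

Under these assumptions, a candidate would meet, e.g., a 94% threshold over a 45 residue window when `best_window_identity(a, b, 45) >= 94.0`.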

Alternatively, polypeptides of the invention can be encoded by
polynucleotides that correspond to any one, or part of SEQ ID NO: 1 to SEQ ID NO: 54 (or
complementary polynucleotides thereof) and/or a fragment thereof, which fragment can
comprise lipase activity. The polypeptides of the invention can, likewise, be encoded by
polynucleotides that hybridize under stringent or highly stringent conditions over
substantially the entire length of such polynucleotides, with the proviso that such sequences
do not correspond to or encode any of the GenBank accession numbers 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108. Similarly, polypeptides that are encoded by subsequences of any such
polynucleotides, e.g., a subsequence comprising at least about 45 contiguous amino acid
residues, sometimes comprising at least about 45 contiguous amino acid residues, and in
some cases comprising 45 contiguous amino acid residues of the polypeptide are also a
feature of the invention. In some instances, such polypeptides are substantially identical to
one or more of SEQ ID NO: 55 to SEQ ID NO: 108 over at least 45 contiguous amino acid
residues with the proviso that such sequences do not correspond to or encode any of the
GenBank accession numbers listed above. In other cases, the polypeptides, regardless of
length, display lipase activity and/or enantioselective lipase activity.

The invention provides isolated or recombinant polypeptides encoded by a
nucleic acid comprising a polynucleotide sequence selected from any of the following: a
polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 54 (or a
complementary polynucleotide sequence thereof); a polynucleotide sequence encoding a
polypeptide selected from SEQ ID NO: 55 to SEQ ID NO: 108 (or a complementary
polynucleotide sequence thereof); a polynucleotide sequence which hybridizes under highly
stringent conditions over substantially the whole length of any of the previous described
polynucleotides, or which hybridizes to a subsequence of the same, comprising at least 100
residues, again, with the proviso that none of the sequences corresponds to or is encoded by
any of the GenBank accession numbers listed above; a polynucleotide sequence which
comprises all, or a fragment of, any of the above described polynucleotides and which
encodes a polypeptide comprising lipase activity and/or lipase enantioselective activity; a
polynucleotide sequence encoding a polypeptide which comprises an amino acid sequence
that is substantially identical over at least 45 contiguous amino acid residues of any one of
SEQ ID NO: 55 to SEQ ID NO: 108, with the proviso that none of the sequences corresponds
to or is encoded by any of the GenBank accession numbers listed above; or a polynucleotide
sequence encoding a polypeptide comprising lipase activity and that is produced by mutating
or recombining one or more of the polynucleotide sequences described above, yet again, with
the proviso that none of the sequences corresponds to or is encoded by any of the GenBank
accession numbers listed above. The invention also provides an isolated or recombinant
polypeptide as described above which comprises an amino acid sequence of any of SEQ ID
NO: 55 to SEQ ID NO: 108.

Isolated or recombinant polypeptides as described above wherein the encoded
polypeptide comprises lipase activity (e.g., against tributyrin, against tributyrin in DMF, or
against tributyrin after heat treatment, i.e., after the polypeptide has been heat treated) and/or
enantioselective lipase activity (e.g., against neryl butyrate or geranyl butyrate) are also
provided. Optionally, such polypeptides as described can comprise lipase activity against
novel substrates (i.e., substrates upon which typical wild-type lipases do not act) such as,
e.g., methyl esters, pentadecanolide, or oxacyclotridecan. Optionally the isolated or
recombinant nucleic acid can encode a polypeptide which comprises enantioselective activity
as well as comprising a polynucleotide sequence encoding a polypeptide with
enantioselective lipase activity. Additionally, such isolated or recombinant polypeptides
optionally are substantially identical over at least 45, at least 50, at least 75, at least 100, at
least 125, at least 150, at least 175, or at least 200 contiguous amino acids of any of the
above described polypeptides. Alternatively, such isolated or recombinant polypeptides are
substantially identical over at least 180, at least 212, at least 213, or at least 215 contiguous
amino acid residues of the above described polypeptide.

In various embodiments, the above described polypeptides comprise one or
more of: a leader sequence, a precursor polypeptide, a secretion signal or a localization
signal, an epitope tag, a fusion protein comprising one or more additional amino acid
sequences, a polypeptide purification subsequence (e.g., an epitope tag, a FLAG tag, a
polyhistidine sequence, a GST fusion), an N-terminus methionine residue, or a modified
amino acid (e.g., a glycosylated amino acid, a PEGylated amino acid, a farnesylated amino
acid, an acetylated amino acid, a biotinylated amino acid, an amino acid conjugated to a lipid
moiety or to an organic derivatizing agent).

A composition comprising one or more polypeptide comprising a modified
amino acid and a pharmaceutically acceptable excipient, and a composition comprising one or
more above described polypeptide with a pharmaceutically acceptable excipient, are also
provided. Additionally, the invention provides a polypeptide which comprises a unique
subsequence in a polypeptide selected from SEQ ID NO: 55 through SEQ ID NO: 108
wherein such subsequence is unique as compared to a polypeptide sequence which
corresponds to an amino acid sequence (or which is encoded by a nucleic acid sequence)
corresponding to any of GenBank accession numbers 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108. Also provided is a polypeptide which is specifically bound by a polyclonal antisera
raised against at least one antigen comprising at least one amino acid sequence from SEQ ID
NO: 55 to SEQ ID NO: 108 (or a fragment thereof) where the antisera is subtracted with a
polypeptide corresponding to an amino acid sequence (or which is encoded by a nucleic acid
sequence) corresponding to any of GenBank accession numbers 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108.

In other aspects the invention includes an antibody or antisera produced by
administering an above described polypeptide of the invention to a mammal and wherein the
antibody or antisera specifically binds at least one antigen which comprises a polypeptide
sequence (or fragment thereof) from SEQ ID NO: 55 to SEQ ID NO: 108 and which
antibody or antisera does not specifically bind to a polypeptide encoded by a nucleic acid
corresponding to, or an amino acid sequence corresponding to one or more of GenBank
accession numbers 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108. In yet other aspects, the invention includes
an antibody or antisera that specifically binds a polypeptide which comprises an amino acid
sequence (or fragment thereof) from SEQ ID NO: 55 to SEQ ID NO: 108 and which
antibody or antisera does not specifically bind to a peptide encoded by a nucleic acid
corresponding to, or an amino acid sequence corresponding to one or more of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

Making Polypeptides 

Recombinant methods for producing and isolating lipase homologue
polypeptides of the invention are described above. In addition to recombinant production,
the polypeptides can be produced by direct peptide synthesis using solid-phase techniques
(see, e.g., Stewart et al. (1969) Solid-Phase Peptide Synthesis, WH Freeman Co, San
Francisco; Merrifield J (1963) J Am Chem Soc. 85:2149-2154). Peptide synthesis can be
performed using manual techniques or by automation. Automated synthesis can be achieved,
for example, using Applied Biosystems 431A Peptide Synthesizer (Perkin Elmer, Foster
City, Calif.) in accordance with the instructions provided by the manufacturer. For example,
subsequences can be chemically synthesized separately and combined using chemical
methods to provide full-length lipase homologues. Fragments of the lipase polypeptides of
the invention, as discussed herein, are also a feature of the invention and can be synthesized
by using the procedures described above.

Polypeptides of the invention can be produced by introducing into a
population of cells a nucleic acid of the invention, wherein the nucleic acid is operatively
linked to a regulatory sequence effective to produce the encoded polypeptide, culturing the
cells in a culture medium to produce the polypeptide, and optionally isolating the polypeptide
from the cells or from the culture medium.

In another aspect, polypeptides of the invention can be produced by
introducing into a population of cells a recombinant expression vector comprising at least
one nucleic acid of the invention, wherein the at least one nucleic acid is operatively linked
to a regulatory sequence effective to produce the encoded polypeptide, culturing the cells in a
culture medium under suitable conditions to produce the polypeptide encoded by the
expression vector, and optionally isolating the polypeptide from the cells or from the culture
medium.

Using Polypeptides 
Antibodies 

In another aspect of the invention, a lipase homologue polypeptide of the
invention is used to produce antibodies which have, e.g., diagnostic and/or therapeutic uses,
e.g., related to the activity, distribution, and expression of lipase homologues.

Antibodies to lipase homologues of the invention can be generated by
methods well known in the art. Such antibodies can include, but are not limited to,
polyclonal, monoclonal, chimeric, humanized, single chain, Fab fragments and fragments
produced by an Fab expression library. Antibodies, i.e., those which block receptor binding,
are especially preferred for therapeutic use.

Lipase homologue polypeptides for antibody induction do not require
biological activity; however, the polypeptide or oligopeptide must be antigenic. Peptides
used to induce specific antibodies can have an amino acid sequence consisting of at least 10
amino acids, preferably at least 15 or 20 amino acids. Short stretches of a lipase polypeptide
can be fused with another protein, such as keyhole limpet hemocyanin (KLH), and antibody
produced against the chimeric molecule.

Methods of producing polyclonal and monoclonal antibodies are known to
those of skill in the art, and many antibodies are available. See, e.g., Coligan (1991) Current
Protocols in Immunology Wiley/Greene, NY; and Harlow and Lane (1989) Antibodies: A
Laboratory Manual Cold Spring Harbor Press, NY; Stites et al. (eds.) Basic and Clinical
Immunology (4th ed.) Lange Medical Publications, Los Altos, CA, and references cited
therein; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed.) Academic
Press, New York, NY; and Kohler and Milstein (1975) Nature 256: 495-497. Other suitable
techniques for antibody preparation include selection of libraries of recombinant antibodies
in phage or similar vectors. See, Huse et al. (1989) Science 246: 1275-1281; and Ward, et al.
(1989) Nature 341: 544-546. Specific monoclonal and polyclonal antibodies and antisera
will usually bind with a Kd of at least about 0.1 µM, preferably at least about 0.01 µM or
better, and most typically and preferably, 0.001 µM or better.

Detailed methods for preparation of chimeric (humanized) antibodies can be
found in U.S. Patent 5,482,856. Additional details on humanization and other antibody
production and engineering techniques can be found in Borrebaeck (ed.) (1995) Antibody
Engineering, Edition Freeman and Company, NY (Borrebaeck); McCafferty et al. (1996)
Antibody Engineering, A Practical Approach IRL at Oxford Press, Oxford, England
(McCafferty), and Paul (1995) Antibody Engineering Protocols Humana Press, Totowa, NJ
(Paul).

In one useful embodiment, this invention provides for fully humanized
antibodies against the lipase homologues of the invention. Humanized antibodies are
especially desirable in applications where the antibodies are used as therapeutics in vivo in
human patients. Human antibodies consist of characteristically human immunoglobulin
sequences. The human antibodies of this invention can be produced using a wide variety of
methods (see, e.g., Larrick et al., U.S. Pat. No. 5,001,065, and Borrebaeck, McCafferty and
Paul, supra, for a review). In one embodiment, the human antibodies of the present
invention are produced initially in trioma cells. Genes encoding the antibodies are then
cloned and expressed in other cells, such as nonhuman mammalian cells. The general
approach for producing human antibodies by trioma technology is described by Ostberg et al.
(1983), Hybridoma 2: 361-367, Ostberg, U.S. Pat. No. 4,634,664, and Engelman et al., U.S.
Pat. No. 4,634,666. The antibody-producing cell lines obtained by this method are called
triomas because they are descended from three cells: two human and one mouse. Triomas
have been found to produce antibody more stably than ordinary hybridomas made from
human cells.

SEQUENCE VARIATIONS

Conservatively Modified Variations 

Lipase homologue polypeptides of the present invention include
conservatively modified variations of the sequences disclosed herein as SEQ ID NO: 55 to
SEQ ID NO: 108. Such conservatively modified variations comprise substitutions, additions
or deletions which alter, add or delete a single amino acid or a small percentage of amino
acids (typically less than about 5%, more typically less than about 4%, 3%, 2%, or 1%, or
less) in any of SEQ ID NO: 55 to SEQ ID NO: 108.

For example, a conservatively modified variation (e.g., deletion) of the 180
amino acid polypeptide identified herein as SEQ ID NO: 75 will have a length of at least 171
amino acids, preferably at least 173 amino acids, preferably at least 175 amino acids, more
preferably at least 177 amino acids, and still more preferably at least 179 amino acids,
corresponding to a deletion of less than about 5%, 4%, 3%, 2%, or 1% or less of the
polypeptide sequence.
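
The lengths recited above follow from simple arithmetic on the 180 residue length of SEQ ID NO: 75. The short Python sketch below reproduces them; the function name is hypothetical, and rounding the permitted number of deleted residues down to a whole residue is an assumption chosen because it matches the figures given in the text.

```python
def min_length_after_deletion(full_length: int, max_percent: int) -> int:
    """Shortest length that deletes no more than max_percent of the residues.

    The number of deletable residues is rounded down to a whole residue
    (integer arithmetic avoids floating-point rounding surprises).
    """
    max_deleted = (full_length * max_percent) // 100
    return full_length - max_deleted


if __name__ == "__main__":
    # 180 is the length stated in the text for SEQ ID NO: 75.
    for percent in (5, 4, 3, 2, 1):
        print(percent, min_length_after_deletion(180, percent))
```

For a 180 residue polypeptide, 5% corresponds to 9 residues, which is also the upper bound recited for conservative substitutions.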

Another example of a conservatively modified variation (e.g., a
"conservatively substituted variation") of the polypeptide identified herein as SEQ ID NO:
75 will contain "conservative substitutions" according to the six substitution groups set forth
in Table 2 (supra), in up to about 9 residues (i.e., less than about 5%) of the 180 amino acid
polypeptide.

The lipase polypeptide sequence homologues of the invention, including
conservatively substituted sequences, can be present as part of larger polypeptide sequences
such as occur upon the addition of one or more domains for purification of the protein (e.g.,
poly his segments, FLAG tag segments, etc.), e.g., where the additional functional domains
have little or no effect on the activity of the lipase portion of the protein, or where the
additional domains can be removed by post synthesis processing steps such as by treatment
with a protease.

In various embodiments, the polypeptide comprises at least about 45, 50, 75,
100, 125, 150, 175, 200 or at least about 215, or more contiguous amino acid residues of any
of SEQ ID NO: 55 to SEQ ID NO: 108. Alternatively, the polypeptide comprises at least
about 180 contiguous amino acid residues, at least about 212 contiguous amino acid
residues, at least about 213 contiguous amino acid residues, or at least about 215 amino acid
residues of any of SEQ ID NO: 55 to SEQ ID NO: 108.
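
As a rough illustration of the "contiguous amino acid residues" language above, the following Python sketch finds the longest stretch of residues that a candidate sequence shares, without gaps, with a reference sequence. The function names and the toy sequences are hypothetical; nothing here is taken from the sequence listing, and exact residue-for-residue matching is an assumption made for simplicity.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch of residues present in both sequences."""
    best = 0
    # prev[j] holds the length of the shared run ending at the previous
    # candidate residue and at reference position j (1-based).
    prev = [0] * (len(reference) + 1)
    for residue in candidate:
        curr = [0] * (len(reference) + 1)
        for j, ref_residue in enumerate(reference, start=1):
            if residue == ref_residue:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def comprises_contiguous(candidate: str, reference: str, minimum: int = 45) -> bool:
    """True if the candidate contains at least `minimum` contiguous reference residues."""
    return longest_shared_run(candidate, reference) >= minimum
```

The default of 45 mirrors the smallest window recited above; any of the other recited lengths (e.g., 180 or 215) can be passed as `minimum` instead.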

DEFINING POLYPEPTIDES BY IMMUNOREACTIVITY

Because the polypeptides of the invention provide a variety of new
polypeptide sequences as compared to other lipases, the polypeptides also provide new
structural features which can be recognized, e.g., in immunological assays. The generation
of antisera which specifically bind the polypeptides of the invention, as well as the
polypeptides which are bound by such antisera, are a feature of the invention.

The invention includes lipase homologue proteins that specifically bind to or
that are specifically immunoreactive with an antibody or antisera generated against an
immunogen comprising an amino acid sequence selected from one or more of SEQ ID NO:
55 to SEQ ID NO: 108. To eliminate cross-reactivity with other lipases, the antibody or
antisera is subtracted with available homologues such as those found in GenBank represented
by or encoded by GenBank accession numbers 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108 (i.e., the "control" lipase
homologue polypeptides). Proteins that can bind specifically as described above can be
determined by aligning any of SEQ ID NO: 55 to SEQ ID NO: 108 against the complete set
of nucleic acids corresponding to or encoded by: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108. Where the GenBank
sequence corresponds to a nucleic acid, a polypeptide encoded by the nucleic acid is
generated and used for antibody/antisera subtraction purposes. Where the nucleic acid
corresponds to a non-coding sequence, e.g., a pseudo gene, an amino acid which corresponds
to the reading frame of the nucleic acid is generated (e.g., synthetically), or is minimally
modified to include a start codon for recombinant production.

In one typical format, the immunoassay uses a polyclonal antiserum which
was raised against one or more polypeptides comprising one or more of the sequences
corresponding to one or more of SEQ ID NO: 55 to SEQ ID NO: 108, or a substantial
subsequence thereof (i.e., at least about 30% of the full length sequence provided). The full
set of potential polypeptide immunogens derived from SEQ ID NO: 55 to SEQ ID NO: 108
is collectively referred to below as "the immunogenic polypeptides." The resulting antisera
is optionally selected to have low cross-reactivity against the control lipase homologues and
any other known homologues and any such cross-reactivity is removed by immunoabsorption
with one or more of the control lipase homologues, or other known homologues, prior to use
of the polyclonal antiserum in the immunoassay.

In order to produce antisera for use in an immunoassay, one or more of the
immunogenic polypeptides is produced and purified as described herein. For example,
recombinant protein may be produced in a mammalian cell line. An inbred strain of mice
(used in this assay because results are more reproducible due to the virtual genetic identity of
the mice) is immunized with the immunogenic protein(s) in combination with a standard
adjuvant, such as Freund's adjuvant, and a standard mouse immunization protocol (see
Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor
Publications, New York, for a standard description of antibody generation, immunoassay
formats and conditions that can be used to determine specific immunoreactivity).
Alternatively, one or more synthetic or recombinant polypeptide derived from the sequences
disclosed herein is conjugated to a carrier protein and used as an immunogen.

Polyclonal sera are collected and titered against the immunogenic polypeptide
in an immunoassay, for example, a solid phase immunoassay with one or more of the
immunogenic proteins immobilized on a solid support. Polyclonal antisera with a titer of 10^ "
or greater are selected, pooled and subtracted with the control lipase homologue polypeptides
to produce subtracted pooled titered polyclonal antisera.

The subtracted pooled titered polyclonal antisera are tested for cross reactivity
against the control lipase homologues (e.g., as enumerated herein). Preferably at least two of
the immunogenic lipase homologues are used in this determination, preferably in conjunction
with at least two of the control lipase homologues, to identify antibodies which are
specifically bound by the immunogenic protein(s).

In this comparative assay, discriminatory binding conditions are determined
for the subtracted titered polyclonal antisera which result in at least about a 5-10 fold higher
signal to noise ratio for binding of the titered polyclonal antisera to the immunogenic lipase
molecules as compared to binding to any control homologues. That is, the stringency of the
binding reaction is adjusted by the addition of non-specific competitors such as albumin or
non-fat dry milk, or by adjusting salt conditions, temperature, or the like. These binding
conditions are used in subsequent assays for determining whether a test polypeptide is
specifically bound by the pooled subtracted polyclonal antisera. In particular, a test
polypeptide which shows at least a 2-5x higher signal to noise ratio than the control
polypeptides under discriminatory binding conditions, and at least about a 1/2 signal to noise
ratio as compared to the immunogenic polypeptide(s), shares substantial structural similarity
with the immunogenic polypeptide as compared to control polypeptides, and is, therefore, a
polypeptide of the invention.
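
The two signal to noise thresholds in the comparative assay above can be written as a simple predicate. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the 2x default is the lower end of the recited 2-5x range, and the one-half fraction relative to the immunogenic polypeptide is an assumed reading of the threshold stated in the text.

```python
def shares_structural_similarity(
    test_sn: float,
    control_sn: float,
    immunogen_sn: float,
    fold_over_control: float = 2.0,      # lower end of the recited 2-5x range
    fraction_of_immunogen: float = 0.5,  # assumed reading of the recited fraction
) -> bool:
    """Apply both signal to noise criteria under discriminatory binding conditions.

    All three inputs are signal to noise ratios measured with the same
    subtracted pooled antisera under the same binding conditions.
    """
    beats_control = test_sn >= fold_over_control * control_sn
    near_immunogen = test_sn >= fraction_of_immunogen * immunogen_sn
    return beats_control and near_immunogen
```

A stricter screen can be obtained by passing `fold_over_control=5.0`, the upper end of the recited range.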

In another example, immunoassays in the competitive binding format are used
for detection of a test polypeptide. For example, as noted, cross-reacting antibodies are
removed from the pooled antisera mixture by immunoabsorption with the control lipase
polypeptides. The immunogenic lipase homologue polypeptide(s) are then immobilized to a
solid support which is exposed to the subtracted pooled antisera. Test proteins are added to
the assay to compete for binding to the pooled subtracted antisera. The ability of the test
protein(s) to compete for binding to the pooled subtracted antisera as compared to the
immobilized protein(s) is compared to the ability of the immunogenic polypeptide(s) added
to the assay to compete for binding (the immunogenic polypeptides compete effectively with
the immobilized immunogenic polypeptides for binding to the pooled antisera). The percent
cross-reactivity for the test proteins is calculated, using standard calculations.

In a parallel assay, the ability of the control proteins to compete for binding to
the pooled subtracted antisera is determined as compared to the ability of the immunogenic
polypeptide(s) to compete for binding to the antisera. Again, the percent cross-reactivity for
the control polypeptides is calculated, using standard calculations. Where the percent
cross-reactivity is at least 5-10x as high for the test polypeptides, the test polypeptides are
said to specifically bind the pooled subtracted antisera.

In general, the immunoabsorbed and pooled antisera can be used in a
competitive binding immunoassay as described herein to compare any test polypeptide to the
immunogenic polypeptide(s). In order to make this comparison, the two polypeptides are
each assayed at a wide range of concentrations and the amount of each polypeptide required
to inhibit 50% of the binding of the subtracted antisera to the immobilized protein is
determined using standard techniques. If the amount of the test polypeptide required is less
than twice the amount of the immunogenic polypeptide that is required, then the test
polypeptide is said to specifically bind to an antibody generated to the immunogenic protein,
provided the amount is at least about 5-10x as high as for a control polypeptide.
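
The 50% inhibition comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the "standard techniques" the text refers to: the 50% point is estimated by log-linear interpolation between the two tested concentrations that bracket it, the function names are hypothetical, and the proviso concerning the control polypeptide is not modeled.

```python
import math
from typing import Sequence


def ic50(concentrations: Sequence[float], percent_binding: Sequence[float]) -> float:
    """Estimate the competitor concentration that leaves 50% of antisera binding.

    `concentrations` must be positive and ascending; `percent_binding` is the
    binding remaining at each concentration, as a percentage of the
    uninhibited signal. Interpolation is linear in log10(concentration).
    """
    points = list(zip(concentrations, percent_binding))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            frac = (b_lo - 50.0) / (b_lo - b_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("binding does not cross 50% within the tested range")


def within_twofold(test_ic50: float, immunogen_ic50: float) -> bool:
    """True if the test polypeptide needs less than twice the immunogen amount."""
    return test_ic50 < 2.0 * immunogen_ic50
```

Interpolating on a logarithmic concentration axis is a design choice suited to dilution series that span several orders of magnitude; a fitted dose-response curve would be a reasonable alternative.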

As a final determination of specificity, the pooled antisera is optionally fully
immunosorbed with the immunogenic polypeptide(s) (rather than any control polypeptides)
until little or no binding of the resulting immunogenic polypeptide subtracted pooled antisera
to the immunogenic polypeptide(s) used in the immunosorption is detectable. This fully
immunosorbed antisera is then tested for reactivity with the test polypeptide. If little or no
reactivity is observed (i.e., no more than 2x the signal to noise ratio observed for binding of
the fully immunosorbed antisera to the immunogenic polypeptide), then the test polypeptide
is specifically bound by the antisera elicited by the immunogenic protein.

ENANTIOSELECTIVE LIPASE ACTIVITY

As described previously, enantiomers are non-superimposable stereoisomers
of a molecule. In other words, they are mirror images of each other. Enantiomers of a
molecule have identical melting points, boiling points, densities, refractive indexes, etc.
However, one form rotates plane-polarized light to the right while the other enantiomer
rotates it to the left. In fact, enantiomers are often designated as (+) or (-) forms of the
molecule. Alternatively, the forms can be labeled as cis and trans forms of the molecule.

Even though enantiomers share many identical properties, when they interact
with other molecules that are also stereochemically specific, differing results (e.g., products)
can result, depending upon which form (cis or trans) interacts with the other molecule. Most
enzymes and many other molecules in biological systems are stereochemically specific.
Thus, the proper enantiomeric form of a molecule can be important if a desired result is to be
achieved. This is true both in biological/pharmacological situations as well as in industrial
settings.

For example, (+) glucose is a commonly metabolized sugar and is extremely
important in, e.g., industrial yeast fermentation. However, (-) glucose (i.e., the opposite
enantiomeric form of glucose) is not commonly metabolized in animals or yeast, etc.
Numerous other examples of such differences exist, such as: (+) glutamic acid/(-) glutamic
acid (only one is used as a flavor enhancer); (+) carvone/(-) carvone (one smells of spearmint
while the other smells of caraway); and (+) chloromycetin/(-) chloromycetin (only one has
antibiotic properties), etc.

Not only can opposing enantiomers be selectively useful or have different
uses, but in some situations one enantiomer can interfere with the usage of its opposing form.
For example, (+) ephedrine has no drug activity and also interferes with the action of its
opposing enantiomer (i.e., (-) ephedrine).

Thus, enzymes specific for interaction with a specific enantiomeric form of a
substrate would be extremely useful in a myriad of chemical/industrial and clinical settings.
For example, a degradative enzyme that was enantioselective for (+) ephedrine could be used
to aid in purification of (-) ephedrine from a mixed (racemic) population of the two
enantiomers.

The lipase homologue polypeptides of the current invention were screened for
enantioselective lipase activity on neryl butyrate and geranyl butyrate. Again, while the
current assays screened with respect to neryl/geranyl butyrate (see EXAMPLE II), it will be
appreciated that the lipase homologues of the invention optionally display lipase and/or
enantioselective lipase activity with respect to a number of different substrates (e.g.,
neryl/geranyl acetate, tributyrin, methyl esters, etc.). Geranyl butyrate is the trans isomer of
3,7-dimethyl-2,6-octadien-1-yl butyrate while neryl butyrate is the cis isomer of the same
compound. Both neryl and geranyl butyrate have industrial uses, e.g., as precursors, etc. in
the perfume/fragrance industry.

The enantioselectivity of the lipase homologue polypeptides of the invention
was determined by measuring the enantiomeric ratio or "E." The enantiomeric ratio is
determined by the equation:

$$E = \frac{\ln[1 - c(1 + DE(p))]}{\ln[1 - c(1 - DE(p))]}$$

in which c = the percent total substrate conversion (expressed as a decimal) and DE(p) is the
diastereomeric excess (i.e., the percent product of a first isomer minus the percent product of
a second isomer) of the products.
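
The enantiomeric ratio defined above is straightforward to evaluate numerically. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that both c and DE(p) are supplied as decimals (e.g., 0.40 for 40% conversion and 0.90 for a 90% excess), which the text states for c but does not state explicitly for DE(p).

```python
import math


def enantiomeric_ratio(c: float, de_p: float) -> float:
    """Evaluate E = ln[1 - c(1 + DE(p))] / ln[1 - c(1 - DE(p))].

    c    : total substrate conversion as a decimal, 0 < c < 1
    de_p : excess of the products as a decimal (assumed convention)
    """
    if not 0.0 < c < 1.0:
        raise ValueError("conversion c must lie strictly between 0 and 1")
    numerator_arg = 1.0 - c * (1.0 + de_p)
    denominator_arg = 1.0 - c * (1.0 - de_p)
    if numerator_arg <= 0.0 or denominator_arg <= 0.0:
        raise ValueError("logarithm undefined for this combination of c and DE(p)")
    denominator = math.log(denominator_arg)
    if denominator == 0.0:
        raise ValueError("E is unbounded when DE(p) equals 1")
    return math.log(numerator_arg) / denominator
```

With no excess (DE(p) = 0) the two logarithms are equal and E is 1, i.e., no selectivity; at 40% conversion and a 90% excess the expression evaluates to roughly 35.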

Figure 1 shows the enantioselectivity of the newly created lipase homologue
polypeptides of the invention for neryl and geranyl butyrate. As can be seen, specific clones
created had specificity for either neryl butyrate or geranyl butyrate.

In other aspects, such isolated or recombinant polypeptide comprises an
amino acid sequence of any one of SEQ ID NO: 55 through SEQ ID NO: 108 over a
comparison window of at least 45 contiguous amino acids.

In some embodiments, the invention comprises such an isolated or
recombinant polypeptide that is at least 45 contiguous amino acid residues of a polypeptide
encoded by a coding polynucleotide sequence wherein the polynucleotide sequence is
selected from: a polynucleotide sequence from any of SEQ ID NO: 1 to SEQ ID NO: 54, a
polynucleotide sequence that encodes a polypeptide selected from any of SEQ ID NO: 55
through SEQ ID NO: 108; or a polynucleotide sequence that hybridizes under stringent
conditions over substantially the entire length of the above polynucleotide sequence or which
hybridizes to a subsequence comprising at least about 100 nucleic acids, provided that none
of the sequences corresponds to or encodes any of GenBank accession numbers: 1I6WA,
1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278,
AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356,
BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196,
CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340,
E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105,
and Z99108.

Additionally, the invention provides such isolated or recombinant polypeptide
wherein the polypeptide is enantioselective for either a cis form substrate enantiomer or for a
trans form substrate enantiomer and optionally wherein the enantiomeric ratio is at least 2 or
more, at least 5 or more, at least 10 or more, at least 50 or more, or at least 100 or more.

The invention also provides such isolated or recombinant polypeptide wherein
the identity is determined by a sequence alignment performed using BLASTP with default
parameters set to measure a desired identity (see above). Additionally, which polypeptide
comprises an amino acid sequence of any of SEQ ID NO: 55 through SEQ ID NO: 108
and/or wherein the identity is determined by a sequence alignment using BLASTP with
default parameters set to measure a desired identity.

Additionally, the invention comprises an isolated or recombinant polypeptide
that is at least 90, at least 94, at least 95, at least 96, at least 97, at least 98, at least 99 or more
percent identical over a comparison window of 45 contiguous amino acids (or 50, 75, 100,
125, 150, 175, 200, 180, 212, 213, or 215 contiguous amino acids) of one or more of SEQ ID
NO: 55 through SEQ ID NO: 108. Also, the invention provides an isolated or recombinant
polypeptide identified by performing a sequence alignment with any one or more of SEQ ID
NO: 55 through SEQ ID NO: 108 using BLASTP with default parameters set to measure a
desired identity.
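
To make the notion of percent identity over a comparison window concrete, the Python sketch below scores two sequences that have already been aligned to equal length, with "-" marking gaps. It is a simplified stand-in and not BLASTP: the alignment itself is assumed to come from elsewhere, and the function names are hypothetical.

```python
def window_identity(aligned_a: str, aligned_b: str, start: int, window: int) -> float:
    """Percent of identical, non-gap positions in one window of a pairwise alignment."""
    a = aligned_a[start:start + window]
    b = aligned_b[start:start + window]
    if len(a) < window or len(b) < window:
        raise ValueError("window extends past the end of the alignment")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / window


def best_window_identity(aligned_a: str, aligned_b: str, window: int = 45) -> float:
    """Highest percent identity found in any window of the given size."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if len(aligned_a) < window:
        raise ValueError("alignment is shorter than the comparison window")
    return max(
        window_identity(aligned_a, aligned_b, start, window)
        for start in range(len(aligned_a) - window + 1)
    )
```

A candidate would meet, for example, the 90 percent level over a 45 residue window if `best_window_identity(a, b, window=45)` returns 90.0 or more.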

COMMERCIAL/INDUSTRIAL METHODS AND COMPOSITIONS

The lipase homologues of the invention are optionally used in compositions to
accomplish numerous commercial and industrial procedures. The lipases of the invention are
optionally used in the synthesis and/or degradation of specific lipids (i.e., to break down
longer lipids and thus synthesize more desirable lipid molecules).

Other non-limiting examples of commercial/industrial uses of the current
lipase homologues include: use as supplements in animal feeds, as agents of flavor
modification and fat modification in human foodstuffs (e.g., cheese), as agents in the creation
of food emulsifiers such as distilled monoglyceride, as agents in the production of fatty acid
esters for texturing agents (e.g., for use in cosmetics), as aids in fractionation of fats, as
means to remove unwanted types of lipids from lipid mixtures thus effectively concentrating
the remaining lipid types (e.g., as a means to increase the percentage of "healthful" fish oils
in mixtures such as dietary supplements), as agents in tanning/processing leather, and as
cleaning agents (see below).

The lipases of the invention are also optionally immobilized on substrates,
e.g., cellulose fibers, capillary tubes, various microchip structures, etc. during use, thus
optionally permitting increased reaction periods, multiple reuse of the lipase molecules,
avoidance of the need to purify out lipase molecules once they are no longer needed, etc.

CLEANING SOLUTIONS 

The lipase homologues of the invention are favorably used in compositions
that serve as cleaning solutions in a wide variety of applications, including laundry detergents,
contact lens cleansing solutions, and dry cleaning, among others.

For example, the present invention provides the use of the novel lipase
homologues of the invention in cleaning and detergent compositions, as well as such
compositions containing mutant lipase enzymes. Such cleaning and detergent compositions
can in principle have any physical form, but the lipase homologues are preferably
incorporated in liquid detergent compositions or in detergent compositions in the form of
bars, tablets, sticks and the like for direct application, wherein they exhibit improved enzyme
stability or performance.

Among the liquid compositions of the present invention are aqueous liquid
detergents having for example a homogeneous physical character, e.g., they can consist of a
micellar solution of surfactants in a continuous aqueous phase, so-called isotropic liquids.
Alternatively, they can have a heterogeneous physical phase and they can be structured,
containing suspended solid particles such as particles of builder materials, e.g., of the kinds
mentioned below. In addition, the liquid detergents according to the present invention can
include an enzyme stabilization system, comprising calcium ion, boric acid, propylene glycol
and/or short chain carboxylic acids. Optionally, the detergents include additional enzyme
components including, e.g., cellulase, amylase, subtilisin, or proteases.

In addition, powder detergent compositions can include, in addition to any one
or more of the lipase homologues of the invention as described herein, such components as
builders (such as phosphate or zeolite builders), surfactants (such as anionic, cationic,
non-ionic or zwitterionic type surfactants), polymers (such as acrylic or equivalent polymers),
bleach systems (such as perborate- or amino-containing bleach precursors or activators),
structurants (such as silicate structurants), alkali or acid to adjust pH (i.e., a pH adjuster),
humectants, and/or neutral inorganic salts. Furthermore, a number of other ingredients are
normally present in the compositions of the invention, such as co-surfactants, tartrate
succinate builder, neutralization system, suds suppressor, other enzymes and other optional
components.

THERAPEUTIC AND PROPHYLACTIC METHODS AND COMPOSITIONS 

Lipases, including the lipase homologue polypeptides and their encoding
nucleic acids, are optionally used in the therapeutic and/or prophylactic treatment of a
number of medical diseases/disorders/conditions.

For example, lipase treatment of subjects is optionally useful in conditions
such as, but not limited to: Crohn's disease, cystic fibrosis, celiac disease, pancreatic
abnormalities (e.g., chronic pancreatitis), nonspecific indigestion, and other gastrointestinal
malabsorption problems.

The amount of lipase polypeptide given in current treatments of such
conditions is variable (as is the normal level of intrinsic lipase) and is preferably adjusted by
a physician to a subject's specific medical condition. In some clinical situations lipase
supplements are given in combination with supplements of other enzymes (e.g., amylases,
proteolytic enzymes, etc.) to help in treatment. As detailed below, the nucleic acids of the
current invention are also optionally utilized in treatment of medical conditions.

The present invention also includes methods of therapeutically or
prophylactically treating a disease or disorder by administering, in vivo or ex vivo, one or
more nucleic acids or fragments thereof or polypeptides or fragments thereof of the invention
described above (or compositions comprising a pharmaceutically acceptable excipient and
one or more such nucleic acids or polypeptides) to a subject, including, e.g., a mammal,
including, e.g., a human, primate, mouse, pig, cow, goat, rabbit, rat, guinea pig, hamster,
horse, sheep; or a non-mammalian vertebrate such as a bird (e.g., a chicken or duck) or a fish,
or invertebrate.

In one aspect of the invention, in ex vivo methods, one or more cells, or a
population of cells of interest of the subject (e.g., tumor cells, tumor tissue sample, organ
cells, blood cells, cells of the skin, lung, heart, muscle, brain, mucosae, liver, intestine,
spleen, stomach, lymphatic system, cervix, vagina, prostate, mouth, tongue, etc.) are obtained
or removed from the subject and contacted with an amount of a polypeptide of the invention
that is effective in prophylactically or therapeutically treating a disease, disorder, or other
condition. The contacted cells are then returned or delivered to the subject to the site from
which they were obtained or to another site (e.g., including those defined above) of interest in
the subject to be treated. If desired, the contacted cells may be grafted onto a tissue, organ,
or system site (including all described above) of interest in the subject using standard and
well-known grafting techniques or, e.g., delivered to the blood or lymph system using
standard delivery or transfusion techniques.

The invention also provides in vivo methods in which one or more cells or a
population of cells of interest of the subject are contacted directly or indirectly with an
amount of a polypeptide of the invention effective in prophylactically or therapeutically
treating a disease, disorder, or other condition. In direct contact/administration formats, the
polypeptide is typically administered or transferred directly to the cells to be treated or to the
tissue site of interest (e.g., tumor cells, tumor tissue sample, organ cells, blood cells, cells of
the skin, lung, heart, muscle, brain, mucosae, liver, intestine, spleen, stomach, lymphatic
system, cervix, vagina, prostate, mouth, tongue, etc.) by any of a variety of formats,
including topical administration, injection (e.g., by using a needle and/or syringe), or vaccine
or gene gun delivery, pushing into a tissue, organ, or skin site. The polypeptide can be
delivered, for example, intramuscularly, intradermally, subdermally, subcutaneously, orally,
intraperitoneally, intrathecally, intravenously, or placed within a cavity of the body
(including, e.g., during surgery), or by inhalation or vaginal or rectal administration.

In in vivo indirect contact/administration formats, the polypeptide is typically
administered or transferred indirectly to the cells to be treated or to the tissue site of interest,
including those described above (such as, e.g., skin cells, organ systems, lymphatic system,
or blood cell system, etc.), by contacting or administering the polypeptide of the invention
directly to one or more cells or population of cells from which treatment can be facilitated.
For example, specific cells (e.g., tumor cells) within the body of the subject can be treated by
contacting cells of the blood or lymphatic system, skin, or an organ with a sufficient amount
of the polypeptide such that delivery of the polypeptide to the site of interest (e.g., tissue,
organ, or cells of interest or blood or lymphatic system within the body) occurs and effective
prophylactic or therapeutic treatment results. Such contact, administration, or transfer is
typically made by using one or more of the routes or modes of administration described
above.

In another aspect, the invention provides ex vivo methods in which one or
more cells of interest or a population of cells of interest of the subject (e.g., tumor cells,
tumor tissue sample, organ cells, blood cells, cells of the skin, lung, heart, muscle, brain,
mucosae, liver, intestine, spleen, stomach, lymphatic system, cervix, vagina, prostate, mouth,
tongue, etc.) are obtained or removed from the subject and transformed by contacting said
one or more cells or population of cells with a polynucleotide construct comprising a target
nucleic acid sequence of the invention or fragments thereof, that encodes a biologically
active polypeptide of interest (e.g., a polypeptide of the invention) that is effective in
prophylactically and/or therapeutically treating the disease, disorder, or other condition. The
one or more cells or population of cells is contacted with a sufficient amount of the
polynucleotide construct and a promoter controlling expression of said nucleic acid sequence
such that uptake of the polynucleotide construct (and promoter) into the cell(s) occurs and
sufficient expression of the target nucleic acid sequence of the invention results to produce an
amount of the biologically active polypeptide effective to prophylactically and/or
therapeutically treat the disease, disorder, or condition. The polynucleotide construct may
include a promoter sequence (e.g., CMV promoter sequence) that controls expression of the
nucleic acid sequence of the invention and/or, if desired, one or more additional nucleotide
sequences encoding at least one or more of another polypeptide of the invention, a cytokine,
adjuvant, or co-stimulatory molecule, or other polypeptide of interest.

Following transfection, the transformed cells are returned, delivered, or
transferred to the subject to the tissue site or system from which they were obtained or to
another site (e.g., tumor cells, tumor tissue sample, organ cells, blood cells, cells of the skin,
lung, heart, muscle, brain, mucosae, liver, intestine, spleen, stomach, lymphatic system,
cervix, vagina, prostate, mouth, tongue, etc.) to be treated in the subject. If desired, the cells
may be grafted onto a tissue, skin, organ, or body system of interest in the subject using
standard and well-known grafting techniques or delivered to the blood or lymphatic system
using standard delivery or transfusion techniques. Such delivery, administration, or transfer
of transformed cells is typically made by using one or more of the routes or modes of
administration described above. Expression of the target nucleic acid occurs naturally or can
be induced (as described in greater detail below) and an amount of the encoded polypeptide
is expressed sufficient and effective to treat the disease or condition at the site or tissue
system (or at another site within the subject).

In another aspect, the invention provides in vivo methods in which one or
more cells of interest or a population of cells of the subject (e.g., including those cells and
cell(s) systems and subjects described above) are transformed in the body of the subject by
contacting the cell(s) or population of cells with (or administering or transferring to the
cell(s) or population of cells using one or more of the routes or modes of administration
described above) a polynucleotide construct comprising a nucleic acid sequence of the
invention that encodes a biologically active polypeptide of interest (e.g., a polypeptide of the
invention) that is effective in prophylactically and/or therapeutically treating the disease,
disorder, or other condition.

The polynucleotide construct can be directly administered or transferred to
cell(s) suffering from the disease or disorder (e.g., by direct contact using one or more of the
routes or modes of administration described above). Alternatively, the polynucleotide
construct can be indirectly administered or transferred to cell(s) suffering from the disease or
disorder by first directly contacting non-diseased cell(s) or other diseased cells using one or
more of the routes or modes of administration described above with a sufficient amount of
the polynucleotide construct comprising the nucleic acid sequence encoding the biologically
active polypeptide, and a promoter controlling expression of the nucleic acid sequence, such
that uptake of the polynucleotide construct (and promoter) into the cell(s) occurs and
sufficient expression of the nucleic acid sequence of the invention results to produce an
amount of the biologically active polypeptide effective to prophylactically and/or
therapeutically treat the disease or disorder, and whereby the polynucleotide construct or the
resulting expressed polypeptide is transferred naturally or automatically from the initial
delivery site, system, tissue or organ of the subject's body to the diseased site, tissue, organ
or system of the subject's body (e.g., via the blood or lymphatic system). Expression of the
target nucleic acid occurs naturally or can be induced (as described in greater detail below)
such that an amount of the encoded polypeptide expressed is sufficient and effective to treat
the disease or condition at the site or tissue system. The polynucleotide construct may
include a promoter sequence (e.g., CMV promoter sequence) that controls expression of the
nucleic acid sequence and/or, if desired, one or more additional nucleotide sequences
encoding at least one or more of another polypeptide of the invention, a cytokine, adjuvant,
or co-stimulatory molecule, or other polypeptide of interest.

In each of the in vivo and ex vivo treatment methods as described above, a
composition comprising an excipient and the polypeptide or nucleic acid of the invention can
be administered or delivered. In one aspect, a composition comprising a pharmaceutically
acceptable excipient and a polypeptide or nucleic acid of the invention is administered or
delivered to the subject as described above in an amount effective to treat the disease or
disorder.

In another aspect, in each in vivo and ex vivo treatment method described
above, the amount of polynucleotide administered to the cell(s) or subject can be an amount
sufficient that uptake of said polynucleotide into one or more cells of the subject occurs and
sufficient expression of said nucleic acid sequence results to produce an amount of a
biologically active polypeptide effective to enhance an immune response in the subject,
including an immune response induced by an immunogen (e.g., antigen). In another aspect,
for each such method, the amount of polypeptide administered to cell(s) or subject can be an 
amount sufficient to enhance an immune response in the subject, including that induced by 
an immunogen (e.g., antigen). 

In yet another aspect, in each in vivo and ex vivo treatment method described
above, the amount of polynucleotide administered to the cell(s) or subject can be an amount
sufficient that uptake of said polynucleotide into one or more cells of the subject occurs and
sufficient expression of said nucleic acid sequence results to produce an amount of a
biologically active polypeptide effective to produce a tolerance or anergy response in the
subject. In another aspect, for each such method, the amount of polypeptide administered to
cell(s) or subject can be an amount sufficient to produce a tolerance or anergy response in the
subject. 

In yet another aspect, in an in vivo or ex vivo treatment method in which a
polynucleotide construct (or composition comprising a polynucleotide construct) is used to
deliver a physiologically active polypeptide to a subject, the expression of the polynucleotide
construct can be induced by using an inducible on- and off-gene expression system.
Examples of such on- and off-gene expression systems include the Tet-On™ Gene
Expression System and Tet-Off™ Gene Expression System (see, e.g., Clontech Catalog
2000, pg. 110-111 for a detailed description of each such system), respectively. Other
controllable or inducible on- and off-gene expression systems are known to those of ordinary
skill in the art. With such a system, expression of the target nucleic acid of the polynucleotide
construct can be regulated in a precise, reversible, and quantitative manner. Gene expression
of the target nucleic acid can be induced, for example, after the stably transfected cells
containing the polynucleotide construct comprising the target nucleic acid are delivered or
transferred to or made to contact the tissue site, organ or system of interest. Such systems are
of particular benefit in treatment methods and formats in which it is advantageous to delay or
precisely control expression of the target nucleic acid (e.g., to allow time for completion of
surgery and/or healing following surgery; to allow time for the polynucleotide construct
comprising the target nucleic acid to reach the site, cells, system, or tissue to be treated; to
allow time for the graft containing cells transformed with the construct to become
incorporated into the tissue or organ onto or into which it has been spliced or attached, etc.).

Therapeutic compositions comprising one or more lipase homologue
polypeptides of the invention are tested in appropriate in vitro and in vivo animal models of
disease, to confirm efficacy, tissue metabolism, and to estimate dosages, according to
methods well known in the art.

Administration is by any of the routes normally used for introducing a 
molecule into ultimate contact with blood or tissue cells. The lipase homologues of the 
invention are administered in any suitable manner, preferably with pharmaceutically 
acceptable carriers. Suitable methods of administering such lipase homologues in the context
of the present invention to a patient are available, and, although more than one route can be
used to administer a particular composition, a particular route can often provide a more
immediate and more effective reaction than another route. 

Pharmaceutically acceptable carriers are determined in part by the particular
composition being administered, as well as by the particular method used to administer the
composition. Accordingly, there is a wide variety of suitable formulations of pharmaceutical
compositions of the present invention. 

Polypeptide compositions can be administered by a number of routes 
including, but not limited to, oral, intravenous, intraperitoneal, intramuscular, transdermal,
subcutaneous, topical, sublingual, or rectal means. Lipase homologue polypeptide
compositions can also be administered via liposomes. Such administration routes and
appropriate formulations are generally known to those of skill in the art. 

The lipase homologue, alone or in combination with other suitable 
components, can also be made into aerosol formulations (i.e., they can be "nebulized") to be 
administered via inhalation. Aerosol formulations can be placed into pressurized acceptable 
propellants, such as dichlorodifluoromethane, propane, nitrogen, and the like.

Formulations suitable for parenteral administration, such as, for example, by 
intraarticular (in the joints), intravenous, intramuscular, intradermal, intraperitoneal, and 
subcutaneous routes, include aqueous and non-aqueous, isotonic sterile injection solutions, 
which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation 
isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile
suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers,
and preservatives. The formulations of packaged nucleic acid can be presented in unit-dose 
or multi-dose sealed containers, such as ampules and vials. 

Parenteral administration and intravenous administration are preferred 
methods of administration. In particular, the routes of administration already in use for
lipase-related therapeutic agents, along with formulations in current use, are preferred routes of
administration and formulation for the lipase polypeptides of the invention. 

Cells transduced with the lipase homologue nucleic acids as described above 
in the context of ex vivo therapy can also be administered intravenously or parenterally as 
described above. It will be appreciated that the delivery of cells to patients is routine, e.g.,
delivery of cells to the blood via intravenous or intraperitoneal administration. 

The dose administered to a patient, in the context of the present invention, is
sufficient to effect a beneficial therapeutic response in the patient over time, depending on
the application. The dose will be determined by the efficacy of the particular vector or
formulation, the activity of the lipase homologue employed, and the condition of the patient, as
well as the body weight or surface area of the patient to be treated. The size of the dose also
will be determined by the existence, nature, and extent of any adverse side-effects that
accompany the administration of a particular vector, formulation, transduced cell type or the
like in a particular patient.

In determining the effective amount of the vector, cell type, or formulation to 
be administered in the treatment or prophylaxis of a disease/condition/etc., the physician 
evaluates circulating plasma levels, vector/cell/formulation/lipase homologue toxicities,
progression of the disease, and the production of anti-vector/lipase homologue antibodies.

The dose administered, e.g., to a 70 kilogram patient will be in the range
equivalent to dosages of currently-used lipase-related therapeutic proteins, and doses of
vectors or cells which produce lipase homologue sequences are calculated to yield an
equivalent amount of lipase homologue nucleic acid or expressed protein. The vectors of this
invention can supplement the treatment of cancers and virally-mediated conditions by any
known conventional therapy, including cytotoxic agents, nucleotide analogues (e.g., when
used for treatment of HIV infection), biologic response modifiers, and the like. 

For administration, lipase homologues and transduced cells of the present
invention can be administered at a rate determined by the LD-50 of the lipase homologue,
vector, or transduced cell type, and the side-effects of the lipase homologues, vector or cell
type at various concentrations, as applied to the mass and overall health of the patient.
Administration can be accomplished via single or divided doses. 

For example, in the therapeutic and prophylactic treatment methods of the
invention described herein, an effective amount of a lipase nucleic acid (e.g., DNA or
mRNA) of the invention (e.g., nucleic acid dosage) will generally be in the range of, e.g.,
from about 0.05 microgram/kilogram (kg) to about 50 mg/kg, usually about 0.005-5 mg/kg.
However, as will be understood, the effective amount of the nucleic acid (e.g., nucleic acid
dosage) and/or polypeptide (e.g., polypeptide dosage) will vary in a manner apparent to those
of ordinary skill in the art according to a number of factors, including the activity or potency
of the polypeptide, the activity or potency of any nucleic acid construct (e.g., vector,
promoter, expression system) to be administered, the disease or condition to be treated, and
the subject to which or whom the nucleic acid is delivered. 
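
As a purely arithmetic illustration of the per-kilogram ranges stated above, the following minimal sketch converts them into total amounts for the 70 kilogram body weight used as an example elsewhere in this section. The function name and structure are hypothetical and are not part of the disclosure; the sketch shows unit conversion only and is not dosing guidance.

```python
# Illustrative unit arithmetic only. The helper name dose_range_mg is an
# assumption introduced for this sketch, not something defined in the text.
UG_PER_MG = 1000.0


def dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return the (low, high) total amount in mg for a body weight in kg."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound exceeds high bound")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


# Broad range from the text: about 0.05 microgram/kg to about 50 mg/kg.
broad = dose_range_mg(70, 0.05 / UG_PER_MG, 50)
# Usual range from the text: about 0.005 mg/kg to about 5 mg/kg.
usual = dose_range_mg(70, 0.005, 5)

print("broad range (mg):", broad)
print("usual range (mg):", usual)
```

Under these figures the broad range works out to roughly 0.0035 mg to 3500 mg of nucleic acid, and the usual range to roughly 0.35 mg to 350 mg, for a 70 kg subject.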

For delivery of some polypeptides, e.g., by delivering nucleic acids encoding 
such polypeptides, for example, adequate levels of translation and/or expression are achieved 
with a nucleic acid dosage of, e.g., about 0.005 mg/kg to about 5 mg/kg. Dosages for other
polypeptides (and nucleic acids encoding them) having a known biological activity can be
readily determined by those of skill in the art according to the factors noted above. Dosages
used for other known lipase-related nucleic acids and polypeptides for particular diseases
provide guidelines for determining dosage and treatment regimen for a nucleic acid or
polypeptide of the invention. An effective amount of a lipase homologue polypeptide may be
in the range of from about 1 microgram to about 1 milligram, and more typically from about
1 microgram to about 100 micrograms. 

A composition for use in therapeutic and prophylactic treatment methods of 
the invention described herein may comprise, e.g., a concentration of a lipase homologue 
nucleic acid (e.g., DNA or mRNA) of the invention of from about 0.1 microgram/milliliter
(ml) to about 20 mg/ml and a pharmaceutically acceptable carrier (e.g., aqueous carrier).

A composition for use in therapeutic and/or prophylactic treatment methods of 
the invention described herein may comprise, e.g., a concentration of a lipase homologue 
polypeptide of the invention in an amount as described above and herein and a 
pharmaceutically acceptable carrier (e.g., aqueous carrier).

For introduction of recombinant lipase nucleic acid transduced cells into a
patient, blood samples are obtained prior to infusion, and saved for analysis. Between 1 X
10^ and 1 X 10^^ transduced cells are infused intravenously over 60- 200 minutes. Vital 
signs and oxygen saturation by pulse oximetry are closely monitored. Blood samples are
obtained 5 minutes and 1 hour following infusion and saved for subsequent analysis. 

Leukopheresis, transduction and reinfusion are optionally repeated every 2 to 3 months for a
total of 4 to 6 treatments in a one year period. After the first treatment, infusions can be
performed on an outpatient basis at the discretion of the clinician. If the reinfusion is given on
an outpatient basis, the participant is monitored for at least 4, and preferably 8, hours following the
therapy. Transduced cells are prepared for reinfusion according to established methods. See,
Abrahamsen et al. (1991) J Clin Apheresis 6:48-53; Carter et al. (1988) J Clin Apheresis
4:113-117; Aebersold et al. (1988) J Immunol Methods 112:1-7; Muul et al. (1987) J
Immunol Methods 101:171-181 and Carter et al. (1987) Transfusion 27:362-365. After a
period of about 2-4 weeks in culture, the cells should number between 1 X 10^6 and 1 X 10^12.
In this regard, the growth characteristics of cells vary from patient to patient and from cell
type to cell type. About 72 hours prior to reinfusion of the transduced cells, an aliquot is
taken for analysis of phenotype, and percentage of cells expressing the therapeutic agent. 

If a patient undergoing infusion of a vector or transduced cell or protein 
formulation develops fevers, chills, or muscle aches, he/she receives the appropriate dose of 
aspirin, ibuprofen, acetaminophen or other pain/fever controlling drug. Patients who
experience reactions to the infusion such as fever, muscle aches, and chills are premedicated
30 minutes prior to the future infusions with either aspirin, acetaminophen, or, e.g., 
diphenhydramine. Meperidine is used for more severe chills and muscle aches that do not 
quickly respond to antipyretics and antihistamines. Cell infusion is slowed or discontinued 
depending upon the severity of the reaction. 

The current invention provides methods to therapeutically or prophylactically
treat a gastrointestinal lipid related condition/disease/disorder by hydrolyzing a lipid through
expressing in a target cell, or contacting a target cell with, an effective amount of a polypeptide
of the invention (or a fragment thereof), both wherein such target cell is in culture and
wherein such target cell is within a subject to be treated. The current invention also provides
a method of therapeutic or prophylactic treatment of a gastrointestinal lipid related
condition/disease/disorder in a subject wherein the subject is administered a polypeptide of
the invention in an amount effective to treat the condition/disease/disorder, including wherein
the subject is a mammal or, more specifically, a human, and wherein the polypeptide is
administered in vivo, in vitro, or ex vivo (or a combination of such) to one or more cells of
the subject. Such polypeptides include compositions of polypeptides comprising the
polypeptide and a pharmaceutically acceptable excipient, which are administered to a subject
in an amount effective to treat a gastrointestinal lipid related condition/disease/disorder (e.g.,
cystic fibrosis, celiac disease, Crohn's disease, indigestion, and obesity).

Another provision of the invention is a method of hydrolyzing a lipid to
therapeutically or prophylactically treat a gastrointestinal lipid related
condition/disease/disorder by introducing into a target cell a nucleic acid of the invention, or
a fragment thereof, which is operably linked to a regulatory sequence active in a target cell
such that introduction of the polynucleotide results in expression of the nucleic acid in an
amount sufficient to hydrolyze the lipid. Such method optionally comprises directly
administering the nucleic acid to a subject in an amount sufficient to introduce the nucleic
acid into one or more cells and wherein the subject comprises a mammal (or a human) and 
wherein the nucleic acid optionally comprises a vector. Yet another provision of the 
invention is a method of therapeutically or prophylactically treating a gastrointestinal lipid 
related condition/disease/disorder by expressing in a target cell (or contacting a target cell
with an effective amount of) a polynucleotide of the invention, or a fragment thereof, or of a
polypeptide encoded thereby (or a fragment thereof). Such method can comprise wherein the 
target is in culture or wherein the target cell is within a subject. Additionally, the invention 
provides a method of therapeutically or prophylactically treating a gastrointestinal lipid 
related condition/disease/disorder in a subject by administering to the subject a
polynucleotide of the invention (or a fragment thereof) or a polypeptide encoded thereby (or
a fragment thereof) in an amount effective to treat the gastrointestinal lipid related 
condition/disease/disorder. Such method comprises optional embodiments wherein the 
subject is a mammal or a human and wherein the polynucleotide and/or polypeptide is 
administered in vivo, in vitro, or ex vivo (or a combination of such) to one or more cells of
the subject and wherein a composition of the polynucleotide and/or polypeptide and a
pharmaceutically acceptable excipient is administered to the subject in an amount effective to
treat the gastrointestinal lipid related condition/disease/disorder (e.g., cystic fibrosis, celiac 
disease, Crohn's disease, indigestion, or obesity). 

INTEGRATED SYSTEMS

The present invention provides computers, computer readable media and 
integrated systems comprising character strings corresponding to the sequence information 
herein for the polypeptides and nucleic acids herein, including, e.g., those sequences listed
herein and the various silent substitutions and conservative substitutions thereof.

Various methods and genetic algorithms (GAs) known in the art can be used 
to detect homology or similarity between different character strings, or can be used to 
perform other desirable functions such as to control output files, provide the basis for making 
presentations of information including the sequences and the like. Examples include
BLAST, discussed supra. Extensive examples of the use of sequences in silico are found in,
e.g., PCT/US00/01202 "METHODS FOR MAKING CHARACTER STRINGS,
POLYNUCLEOTIDES AND POLYPEPTIDES HAVING DESIRED
CHARACTERISTICS" by Selifonov et al., filed Jan. 18, 2000; PCT/US00/01230
"OLIGONUCLEOTIDE MEDIATED NUCLEIC ACID RECOMBINATION" by Crameri et
al., filed Jan. 18, 2000; and PCT/US00/01138 "METHODS OF POPULATING DATA
STRUCTURES FOR USE IN EVOLUTIONARY SIMULATIONS" by Selifonov and
Stemmer, filed Jan. 18, 2000.

Thus, different types of homology and similarity of various stringency and 
length can be detected and recognized in the integrated systems herein. For example, many
homology determination methods have been designed for comparative analysis of sequences
of biopolymers, for spell-checking in word processing, and for data retrieval from various
databases. With an understanding of double-helix pair-wise complement interactions among
the 4 principal nucleobases in natural polynucleotides, models that simulate annealing of
complementary homologous polynucleotide strings can also be used as a foundation of
sequence alignment or other operations typically performed on the character strings
corresponding to the sequences herein (e.g., word-processing manipulations, construction of
figures comprising sequence or subsequence character strings, output tables, etc.). An
example of a software package with GAs for calculating sequence similarity is BLAST,
which can be adapted to the present invention by inputting character strings corresponding to
the sequences herein.
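
The pair-wise complement model and the notion of similarity between character strings described above can be sketched in a few lines. The following is a minimal illustration only, assuming plain upper- or lower-case DNA strings over A, C, G and T; the function names are hypothetical, the input strings are toy data rather than sequences of the invention, and the position-by-position identity shown here is far simpler than what a package such as BLAST computes.

```python
# Minimal sketch: complement pairing and position-wise identity of
# character strings. Names and toy inputs are assumptions for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA character string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneals(strand, other):
    """True if `other` is the exact reverse complement of `strand`."""
    return reverse_complement(strand) == other.upper()


def percent_identity(a, b):
    """Percent of matching positions between two equal-length strings."""
    if not a or len(a) != len(b):
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


print(reverse_complement("ATGC"))        # GCAT
print(anneals("ATGC", "GCAT"))           # True
print(percent_identity("ATGC", "ATGA"))  # 75.0
```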
Similarly, standard desktop applications such as word processing software
(e.g., Microsoft Word™ or Corel WordPerfect™) and database software (e.g., spreadsheet
software such as Microsoft Excel™, Corel Quattro Pro™, or database programs such as
Microsoft Access™ or Paradox™) can be adapted to the present invention by inputting a
character string corresponding to the lipase homologues of the invention (either nucleic acids
or proteins, or both). For example, the integrated systems can include the foregoing software
having the appropriate character string information, e.g., used in conjunction with a user
interface (e.g., a GUI in a standard operating system such as a Windows, Macintosh or
UNIX system) to manipulate strings of characters. As noted, specialized alignment
programs such as BLAST can also be incorporated into the systems of the invention for
alignment of nucleic acids or proteins (or corresponding character strings).

Integrated systems for analysis in the present invention typically include a
digital computer with GA software for aligning sequences, as well as data sets entered into
the software system comprising any of the sequences herein. The computer can be, e.g., a
PC (Intel x86 or Pentium chip-compatible DOS™, OS2™, WINDOWS™, WINDOWS
NT™, WINDOWS95™, WINDOWS98™, LINUX based machine, a MACINTOSH™,
Power PC, or a UNIX based (e.g., SUN™ work station) machine) or other commercially
common computer which is known to one of skill. Software for aligning or otherwise
manipulating sequences is available, or can easily be constructed by one of skill using a
standard programming language such as Visual Basic, Fortran, Basic, Java, or the like.

Any controller or computer optionally includes a monitor which is often a 
cathode ray tube ("CRT") display, a flat panel display (e.g., active matrix liquid crystal 
display, liquid crystal display), or others. Computer circuitry is often placed in a box which 
includes numerous integrated circuit chips, such as a microprocessor, memory, interface
circuits, and others. The box also optionally includes a hard disk drive, a floppy disk drive, a
high capacity removable drive such as a writeable CD-ROM, and other common peripheral 
elements. Inputting devices such as a keyboard or mouse optionally provide for input from a 
user and for user selection of sequences to be compared or otherwise manipulated in the 
relevant computer system. 
The computer typically includes appropriate software for receiving user
instructions, either in the form of user input into a set of parameter fields, e.g., in a GUI, or in
the form of preprogrammed instructions, e.g., preprogrammed for a variety of different
specific operations. The software then converts these instructions to appropriate language for 
instructing the operation of, e.g., fluid direction and transport controllers to carry out the 
desired operation. 

The software can also include output elements for controlling nucleic acid 
synthesis (e.g., based upon a sequence or an alignment of sequences herein) or other
operations which occur downstream from an alignment or other operation performed using a
character string corresponding to a sequence herein. 

In one embodiment, the invention provides an integrated system comprising a 
computer or computer readable medium comprising a database having one or more sequence 
records. Each of the sequence records comprises one or more character strings 
corresponding to a nucleic acid or polypeptide or protein sequence selected from SEQ ID 
NO: 1 to SEQ ID NO: 108. The integrated system further comprises a user input interface 
allowing a user to selectively view the one or more sequence records. In one such integrated 
system, the computer or computer readable medium comprises an alignment instruction set 
that aligns the character strings with one or more additional character strings corresponding
to a nucleic acid or polypeptide or protein sequence. 
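
The embodiment above (a database of sequence records, a way to view selected records, and an alignment instruction set) can be sketched as follows. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the records hold short toy strings rather than any of SEQ ID NO: 1 to 108, and the alignment routine is a textbook Needleman-Wunsch global alignment with simple match, mismatch and gap scores, swapped in here for illustration rather than the BLAST-type software the text refers to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    seq_id: str    # label for the record (toy labels are used below)
    kind: str      # "nucleic acid" or "polypeptide"
    residues: str  # the character string itself


def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch alignment; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


class SequenceDatabase:
    """In-memory store of sequence records keyed by their label."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[record.seq_id] = record

    def view(self, seq_id):
        return self._records[seq_id]

    def align_against(self, query):
        """Align every stored character string with an additional string."""
        return {rid: global_align(rec.residues, query)
                for rid, rec in self._records.items()}


db = SequenceDatabase()
db.add(SequenceRecord("toy-1", "nucleic acid", "ACGT"))
db.add(SequenceRecord("toy-2", "nucleic acid", "GATTACA"))
print(db.view("toy-1"))
print(db.align_against("AGT")["toy-1"])  # (2, 'ACGT', 'A-GT')
```

A dictionary keyed by record label keeps the "selectively view" step a single lookup; a real system would likely sit on a database engine and a heuristic aligner instead.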

One such integrated system includes an instruction set that comprises at least 
one of the following: a local sequence comparison or a local homology comparison 
determination, a sequence alignment or a homology alignment determination, a sequence 
identity or similarity search or a search for similarity determination, a sequence identity or 
similarity determination, a structural similarity search, a structure determination, a nucleic 
acid motif determination, an amino acid motif determination, a hypothetical translation, a
determination of a restriction map, a sequence recombination and a BLAST determination.
In some embodiments, the system further comprises a readable output element that displays
an alignment produced by the alignment instruction set. In another embodiment, the
computer or computer readable medium further comprises an instruction set that translates at
least one nucleic acid sequence which comprises a sequence selected from SEQ ID NO: 1 to
SEQ ID NO: 54 into an amino acid sequence. The instruction set may select the nucleic acid
by applying a codon usage instruction set or an instruction set which determines sequence 
identity to a test nucleic acid sequence. 
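
An instruction set that translates a nucleic acid character string into an amino acid character string (the "hypothetical translation" listed above) might look like the following minimal sketch. It assumes the standard genetic code, reads the first reading frame only, and stops at the first stop codon; the function name and the toy input are illustrative assumptions and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(nucleic_acid):
    """Translate frame 1 of a DNA or RNA string, stopping at a stop codon."""
    seq = nucleic_acid.upper().replace("U", "T")
    protein = []
    for start in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[start:start + 3])
        if amino_acid is None:
            raise ValueError("unrecognized codon at position %d" % start)
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # MA
```

Selecting which nucleic acid to translate by codon usage or by identity to a test sequence, as the text contemplates, would be a separate step layered on top of this lookup.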

Methods of using a computer system to present information pertaining to at
least one of a plurality of sequence records stored in a database are also provided. Each of
the sequence records comprises at least one character string corresponding to SEQ ID NO: 1
to SEQ ID NO: 108. The method comprises determining at least one character string
corresponding to one or more of SEQ ID NO: 1 to SEQ ID NO: 108 or a subsequence
thereof; determining which of the at least one character string of the list are selected by a
user; and displaying each of the selected character strings, or aligning each of the selected
character strings with an additional character string. The method may further comprise
displaying an alignment of each of the selected character strings with an additional character
string and/or displaying the list.

The current invention provides a database of one or more character strings
corresponding to polynucleotide sequences selected from SEQ ID NO: 1 to SEQ ID NO: 54
or a polypeptide sequence selected from SEQ ID NO: 55 to SEQ ID NO: 108. In such a database,
one or more character strings are optionally recorded in a computer readable
medium (e.g., one that resides internal or external to a computer). The invention also provides a
method for manipulating a sequence record in a computer system by reading a character
string (optionally selected by a user, wherein the user selects the character
string from a database or inputs the character string into the computer system) corresponding to a
polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 54 or a polypeptide
sequence selected from SEQ ID NO: 55 to SEQ ID NO: 108 (or a subsequence thereof),
performing an operation on the character string, and returning a result of the operation
(optionally comprising transmitting the selected character string to an output device). The
operations performed in such computer system optionally comprise any of the following: a
local sequence comparison, a sequence alignment, a sequence identity or similarity search, a
structural similarity search, a sequence identity or similarity determination, a structure
determination, a nucleic acid motif determination, an amino acid motif determination, a
hypothetical translation, a determination of a restriction map, a sequence recombination, or a
BLAST determination. Such method can comprise aligning the selected character string with
one or more additional character strings corresponding to a polynucleotide or polypeptide
sequence; translating one or more character strings from SEQ ID NO: 1 to SEQ ID NO: 54
into a character string corresponding to an amino acid sequence or translating a character
string selected from SEQ ID NO: 55 to SEQ ID NO: 108 into a character string
corresponding to a polynucleotide sequence; determining sequence identity or similarity
between the selected character string and one or more additional character strings by
evaluating codon usage (optionally determining optimal codon usage); and obtaining the
result of the operation on a user output device (e.g., optionally selected from a display
monitor, a printer, and an audio output). The method of the invention for manipulating a
sequence record in a computer system also comprises wherein the operation transmits the
character string to a device (e.g., an oligonucleotide synthesizer or peptide synthesizer)
capable of producing a physical embodiment of the character string (e.g., a physical
embodiment comprising a nucleic acid or polypeptide or peptide corresponding to a character
string or a sub-portion thereof).
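
The read, operate and return pattern described above can be sketched with one of the listed operations, a determination of a restriction map. The sketch below is a minimal illustration under stated assumptions: the three enzymes and their recognition sequences (EcoRI, BamHI, HindIII) are chosen here as familiar examples and are not named in the text, the dispatch table and function names are hypothetical, and the input is a toy string rather than a sequence of the invention.

```python
# Recognition sequences for three common restriction enzymes, used here
# only as illustrative inputs.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def restriction_map(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    seq = sequence.upper()
    result = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping hits
        result[enzyme] = positions
    return result


# Dispatch table mapping an operation name to the function that performs it.
# "length" is a trivial placeholder operation added for this sketch.
OPERATIONS = {
    "restriction_map": restriction_map,
    "length": len,
}


def manipulate(character_string, operation):
    """Read a character string, perform the named operation, return the result."""
    if operation not in OPERATIONS:
        raise KeyError("unknown operation: %s" % operation)
    return OPERATIONS[operation](character_string)


print(manipulate("AAGAATTCGGGGATCC", "restriction_map"))
```

Further operations from the list, such as a translation or an alignment, could be registered in the same table without changing the calling code.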

KITS 

In an additional aspect, the present invention provides kits embodying the 
methods, composition, systems and apparatus herein. Kits of the invention optionally
comprise one or more of the following: (1) an apparatus, system, system component or
apparatus component as described herein; (2) instructions for practicing the methods
described herein, and/or for operating the apparatus or apparatus components herein and/or
for using the compositions herein; (3) one or more lipase composition or component; (4) a 
container for holding components or compositions, and, (5) packaging materials. 

In a further aspect, the present invention provides for the use of any apparatus,
apparatus component, composition or kit herein, for the practice of any method or assay
herein, and/or for the use of any apparatus or kit to practice any assay or method herein. 

EXAMPLES

EXAMPLE I: DETECTION OF LIPASE SECRETING BACTERIA

As described above, the nucleic acid and amino acid sequences of SEQ ID NO:
1 through SEQ ID NO: 20 and SEQ ID NO: 55 through SEQ ID NO: 74 were discovered and
isolated in a number of Bacillus species (both species-typed and un-typed species). In order
to choose Bacillus cultures that expressed lipase activity, two types of plate assays were
performed.

The first type of plate assay comprised a rhodamine B assay (see, e.g.,
Kouker, G., et al., Specific and sensitive plate assay for bacterial lipases, Appl Environ
Microbiol (1987) 53:211-213). The assay entails preparing TGY media plates, onto which
various Bacillus colonies were patched. The TGY media plates were prepared by mixing 5 g
tryptone, 5 g yeast extract, 5 g dextrose, and 1 g K2HPO4 per liter of media. The media was
autoclaved and cooled to approximately 60°C before 30 milliliters of filter-sterilized
soybean oil and 2 milliliters of filter-sterilized rhodamine B solution (0.1%) were
vigorously mixed in. The media was then plated into petri dishes.
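
For batches other than one liter, the TGY recipe above scales linearly. The following minimal sketch shows that arithmetic; it assumes that the 30 milliliters of soybean oil and 2 milliliters of rhodamine B solution are per liter of media, which the text implies but does not state outright, and the dictionary keys and function name are hypothetical.

```python
# Per-liter amounts taken from the recipe above. Treating the oil and dye
# volumes as per-liter quantities is an assumption made for this sketch.
TGY_PER_LITER = {
    "tryptone_g": 5.0,
    "yeast_extract_g": 5.0,
    "dextrose_g": 5.0,
    "K2HPO4_g": 1.0,
    "soybean_oil_ml": 30.0,
    "rhodamine_B_0.1pct_ml": 2.0,
}


def scale_recipe(volume_liters, per_liter=TGY_PER_LITER):
    """Scale each component linearly to the requested batch volume."""
    if volume_liters <= 0:
        raise ValueError("volume_liters must be positive")
    return {name: amount * volume_liters for name, amount in per_liter.items()}


for name, amount in scale_recipe(0.5).items():
    print(name, amount)
```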

If the Bacillus colonies that were patched onto the TGY plates secreted active 
lipase enzymes, such enzymes would act upon the soybean oil in the plates, thus releasing 
free fatty acids. The free fatty acids would then react with the rhodamine B to create a 
visible fluorescent orange compound. Thus, Bacillus colonies that expressed active lipase
could be visually detected (after 24-48 hours) by the fluorescent orange halo around the
colonies. 

The second type of plate based assay used to detect the presence of lipase
activity was used to check for lipase activity of E. coli bacterial colonies. E. coli cultures
were transformed with expression vectors containing either the newly discovered Bacillus
lipase variants (e.g., as detected above) or with newly created (i.e., recombined) lipase
homologue variants. The transformed cell colonies were grown on plates containing LB
media supplemented with tributyrin at a final concentration of 1%. Colonies expressing an
active lipase secreted such lipase into the surrounding media (which was hazy due to the
tributyrin), thus degrading the tributyrin and producing a clear media ring around the lipase
active colonies.

EXAMPLE II: SCREENING LIPASE HOMOLOGUES FOR ENANTIOSELECTIVITY

A. Substrate Synthesis 

All materials were purchased from Sigma or Aldrich unless noted. Neryl
butyrate was prepared from nerol and butyryl chloride in methylene chloride/pyridine.
Geranyl deuterobutyrate was prepared from geraniol and deuterobutyric acid (Isotec) using
DCC coupling in methylene chloride. Both compounds were purified by flash 
chromatography (ether/hexanes) and gave satisfactory analysis by mass spectrometry and 
NMR. 

B. Library Pre-Selection and Enzyme Preparation 

Transformants were robotically picked to 384-well microtiter plates
containing 70 µL growth medium (2xYT, 0.5% glucose to suppress induction, 30 µg/ml
chloramphenicol) and grown 12-20 hours at 37°C, 300 rpm shaking speed in a Kuhner
incubator. The cultures were then gridded via a Q-bot robot (Genetix, UK) to inducing agar
(2xYT, 1.5% agar, 1 mM IPTG, 30 µg/ml chloramphenicol) in 22 cm x 22 cm bioassay trays
using 0.25 mm pins, and incubated at 30°C for 16-20 hours. The colonies were then overlaid
with substrate (1% neryl butyrate or geranyl butyrate) in 150 mL of 1.5% agar containing 2
mM Hepes, pH 7.4, and 1% Triton X-100 that had been heated to 45°C. The reaction was
allowed to proceed at room temperature for 5 to 20 hours, until clearing zones around active
colonies were visible. The trays were imaged against a black background with an Alpha
Innotech Fluorchem imaging system, and the images were analyzed using Phoretix Array
image analysis software. Active clones were identified based upon the intensity of the
corresponding clearing zone, and transferred (5 µL) from the master 384-well plates to rows
1-7 of 96-well microtiter plates containing 200 µL growth medium. The final row of the
96-well plate was spiked with 5 µL cultures transformed with a plasmid that did not contain
an active lipase as a negative background control. The cultures were grown overnight at 37°C
at 200-230 rpm shaking speed in a Kuhner incubator. The following day, 10 µL of each culture
was dispensed into 200 µL inducing media (2xYT, 1 mM IPTG, 30 µg/ml chloramphenicol)
in a second 96-well plate. The cultures were induced for 16-20 hours at 30°C, 200 rpm in a
Kuhner incubator. The cells were then pelleted by centrifugation and the lipase-containing
supernatant assayed as described below.
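
The 96-well layout described above (active clones in rows 1-7, negative controls in the final row) can be tracked with a simple plate map. This is an illustrative sketch, not the software actually used: the function name and clone identifiers are hypothetical, and it assumes that "rows" refers to the eight lettered rows (A-H) of twelve wells on a standard 96-well plate.

```python
import string

ROWS = string.ascii_uppercase[:8]   # rows A-H of a standard 96-well plate
COLUMNS = range(1, 13)              # columns 1-12
SAMPLE_ROWS = ROWS[:7]              # rows 1-7 hold the active clones
CONTROL_ROW = ROWS[7]               # final row holds the negative controls
MAX_CLONES = len(SAMPLE_ROWS) * len(COLUMNS)  # 84 clones per plate


def layout_plate(clone_ids):
    """Map well names (e.g. 'A1') to clone IDs, filling the sample rows in order."""
    if len(clone_ids) > MAX_CLONES:
        raise ValueError(f"at most {MAX_CLONES} clones fit on one plate")
    sample_wells = [f"{row}{col}" for row in SAMPLE_ROWS for col in COLUMNS]
    plate = dict(zip(sample_wells, clone_ids))
    # Every well of the final row receives the lipase-free control culture.
    for col in COLUMNS:
        plate[f"{CONTROL_ROW}{col}"] = "negative control"
    return plate


if __name__ == "__main__":
    plate = layout_plate(["clone-001", "clone-002", "clone-003"])
    for well, content in plate.items():
        print(well, content)
```

Under this assumption each 96-well plate carries up to 84 candidate clones alongside 12 background-control wells.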

C. Reactions, Mass Spectrometrical Analysis, and Results

Ten µL of cell supernatant was added to 90 µL reaction mix that
contained 2.78 mM neryl butyrate, 2.78 mM geranyl deuterobutyrate, and 1 mM morpholine
acetate, pH 7.4, in a 96-well plate. The plates were sealed with plastic tape and shaken on a
MicroMix (Diagnostics Products Corporation) set to mix at amplitude 4, form 20. After 8
hours, 10 µL of this reaction mix was added to 90 µL 40:50 H2O:MeOH. The final row of
the plate was spiked with known concentrations of butyrate and deuterobutyrate (0-50 µM)
to provide calibration curves. The plates were sealed (Microliter Analytical polypropylene
& aluminum foil film) and analyzed by LC/MS for butyrate and deuterobutyrate
concentrations. Clones showing desired specificity were then re-confirmed by GC/MS.
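
The calibration step above can be sketched in code. The text does not state how the calibration curves were fitted or how specificity was scored, so the following is only one plausible treatment: it assumes a linear detector response fitted by ordinary least squares, and it uses the fraction of released acid that is unlabeled butyrate (from neryl butyrate) as a hypothetical specificity score. All function names and numbers are illustrative.

```python
def fit_calibration(conc_uM, signal):
    """Fit signal = slope * concentration + intercept by least squares."""
    n = len(conc_uM)
    if n < 2 or n != len(signal):
        raise ValueError("need at least two matched standards")
    mean_x = sum(conc_uM) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in conc_uM)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc_uM, signal))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def signal_to_uM(signal, slope, intercept):
    """Convert a measured signal to a concentration using the fitted line."""
    return (signal - intercept) / slope


def butyrate_fraction(butyrate_uM, deuterobutyrate_uM):
    """Fraction of total released acid that is unlabeled butyrate, or None."""
    total = butyrate_uM + deuterobutyrate_uM
    if total <= 0:
        return None
    return butyrate_uM / total


if __name__ == "__main__":
    # Hypothetical standards spanning the 0-50 uM range and their signals.
    slope, intercept = fit_calibration([0, 10, 25, 50], [1, 21, 51, 101])
    butyrate = signal_to_uM(61, slope, intercept)
    deuterobutyrate = signal_to_uM(21, slope, intercept)
    print(butyrate_fraction(butyrate, deuterobutyrate))
```

Because each sample is diluted ten-fold (10 µL into 90 µL) before analysis, a measured concentration would be multiplied by 10 to recover the concentration in the reaction well; the fraction itself is unaffected by that common factor.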

While the foregoing invention has been described in some detail for purposes
of clarity and understanding, it will be clear to one skilled in the art from a reading of this
disclosure that various changes in form and detail can be made without departing from the
true scope of the invention. For example, all the techniques, methods, compositions,
apparatus and systems described above may be used in various combinations. All
publications, patents, patent applications, or other documents cited in this application are
incorporated by reference in their entirety for all purposes to the same extent as if each
individual publication, patent, patent application, or other document were individually
indicated to be incorporated by reference for all purposes.

SEQUENCE LISTINGS


SEQ ID NO

CLONE NAME 

SEQUENCE 

SEQ ID NO: 1

405 (pumilus)

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTTGTTATGGTTCACGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCAAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA 
ACGGGTGPGAAAAAAGTnGATATTGTCGCTCA 
CAGTATGGGTGGCGCGAACACACCTTACTACA 
TAAAAAATCTGGACGGCGGAAATAAAATTGAA 
AACGTCGTAACGCTTGGCGGCGCGAACCGTTC 
GACGACAAGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATACACATCCATTTAC 
AGCAGTGCCGATATGATTGTCATGAATTACTT 
ATCAAAATTAGACGGTGCTAAAAACGCTCAAA 
TTCATGGCGTTGGGCACATTGGTTTATTGATG 
AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 
ACTGAACGGCGGGGGCCAAAATACGAATTAA 

SEQ ID NO: 2

406 (subtilis) 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTTTGGGACAA 
GACAGGCACAAATTATAACAATGGACCGGTAT 
TACCACGATTTGTGCAAAAGGTTTTAGATGAA 

CAGCATGGGGGGCGCGAACACACTTTACTACA 
TAAAAAATCTGGACGGCGGAAATAAAGTTGCA 
AACGTCGTGACGCTTGGCGGCGCGAACCGTTT 
GACGACAGGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATACACATCCATTTAC 
AGCAGTGCCGATATGATTGTCATAAATTACTT 
ATCAAGATTAGATGGTGCTAGAAACGTTCAAA 
TCCATGGCGTTGGACACATCGGCCTTCTGTAC 
AGCAGCCAAGTCAACAGCCTGATTAAAGAAGG 
GCTGAACGGCGGGGGACTCAATACAAATTAG 

SEQ ID NO: 3

402 (megaterium) 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT
GACACAATCCAGTTGTTATGGTTCACTGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCAAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA 
ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGTGGCGCGAACACACTTTACTACA 

AACGTCGTAACGCTTGGCGGCGCGAACCGTTT 
GACGACAAGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATACACATCCATTTAC 
AGCAGTGCCGATATGATTGTCATGAATTACTT 
ATCAAAATTAGACGGTGCTAAAAACGTTCAAA 
TTCATGGCGTTGGGCACATTGGTTTATTGATG 
AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 
ACTGAACGGCGGGGGCCACAATACAAATTAA 

SEQ ID NO: 4

400 (lentus) 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTTGTTATGGTTCACGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCAAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA 
ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGTGGCGCGAACACACTTTACTACA 

AACGTCGTAACGCTTGGCGGCGCGAACCGTTT 
GACGACAAGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATACACATCCATTTAC 
AGCAGTGCCGATATGATTGTCATGAATTACTT 
ATCAAAATTAGACGGTGCTAAAAACGTTCAAA 
TTCATGGCGTTGGGCACATTGGTTTATTGATG 
AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 

SEQ ID NO: 5

396 (circulans) 

ATGAAATTTATAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTTGTTATGGTTCACGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCAAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA

ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 

CAGCATGGGTGGCGCGAACACACTTTACTACA 

TAAAAAATCTGGACGGCGGAAATAAAATTGAA

AACGTCGTAACGCTTGGCGGCGCGAACCGTTT 

GACGACAAGCAAGGCGCTTCCGGGAACAGATC 

CAAATCAAAAGATTTTATACACATCCATTTAC 

AGCAGTGCCGATATGATTGTCATGAATTACTT 

ATCAAAATTAGACGGTGCTAAAAACGTTCAAA 

TTCATC5GCGTTGGGCACATTGGTTTATTO 

AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 

ACTGAACGGCGGGGGCCTCAATACAAATTAA 

SEQ ID NO: 6

392 (azotoformans)

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTTGTTATGGTTCACGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCGAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA 
ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGTGGCGCGAACACACTTTACTACA 

AACGTCGTAACGCTTGGCGGCGCGAACCGTTT 
GACGACAAGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATAGACATCCATTTAC 
AGCAGTGCCAATATGATTGTCATGAATTACTT 
ATCAAAATTAGACGGTGCTAAAAACGTACAAA 
TTCATGGCGTTGGGCACATTGGTTTATTGATG 
AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 
ACTGAACGGCGGGGGCCTAGATACAAATTAA 

SEQ ID NO: 7

398 (firmus)

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTTGTTATGGTTCACGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCAAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA 
ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGTGGCGCGAACACACTTTACTACA 
TAAAAAATCTGGACGGCGGAAATAAAATTGAA 
AACGTCGTAACGCTTGGCGGCGCGAACCGTTT 
GACGACAAGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATACACATCCATTTAC
AGCAGTGCCGATATGATTGTCATGAATTACTT 
ATCAAAATTAGACGGTGCTAAAAACGCTCAAA 
TTCATGGCGTTGGGCACATTGGTTTATTGATG 
AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 
ACTGAACGGCGGAGGCCACAATACAAATTAA 

SEQ ID NO: 8 

393 (badius) 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGGTGCTGTCAGTCACATCGC 
TGTTTGCGATGCAGCCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTTGTTATGGTTCACGGTAT 
CGGAGGAGCTTCATACAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCACGG 
GGCAAGCTGTATGCGGTTGATTTTTGGGACAA 
GACAGGGACGAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGACGAA 
ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGTGGCGCGAACACACTTTACTACA 

AACGTCGTAACGCTTGGCGGCGCGAACCGTTT 
GACGACAAGCAAGGCGCTTCCGGGAACAGATC 
CAAATCAAAAGATTTTATACACATCCATTTAC 
AGCAGTGCCGATATGATTGTCATGAATTACTT
ATCAAAATTAGACGGTGCTAAAAACGTTCAAA 
TTCATGGCGTTGGGCACATTGGTTTATTGATG 
AACAGCCAAGTCAACAGCCTGATTAAAGAAGG 
ACTGAACGGCGGAGGCCACAATACAAATTAA 

SEQ ID NO: 9

Dc5h 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTCAAGGACAA 
GACAGGCACAAATTATAACAATGGCCCGGTAT 
TATCACGATTTGTGCAAAAGGTTTTAGATGAA 
ACGGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGGGGCGCGAACACACTTTACTACA 
TAAAAAATCTGGACGGCGGAAATAAAGTTGAA

AACGTCGTGACGCTTGGCGGCGCCAACCGTTT
GACGACAGGCAAGGCGCTTCCGGGAACAGATC

CAAATCAAAAGATTTTATACACATCCATTTAC 
AGCAGTGCCGATATGATTGTCATGAATTATTT 
ATCAAGATTAGATGGTGCGAGAAACGTTCAAA 
TCCATGGCGTTGGACACATCGGCCTTCTGTAC 
AGCAGCCAAGTCAACAGCCTGATTAAAGAAGG 
GCTGAACGGCGGGGGCCTCAATACAAATTAA 

SEQ ID NO: 10

Dc5f 

ATGAAAriTCTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTACCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTCTAAGACAA 
AACAGGGAATAACCGCAACAATGGTCCGCGTC 
TATCGAGATTCGTCAAAGATGTGTTAGACAAA 
ACGGGTGCCAAAAAAGTAGATATTGTGGCTCA 
TAGTATGGGTGGAGCGAACACGCTATACTATA 

AACGTTGTCACAATTGGTGGAGCAAACGGACT 
CGTTTCAAGCAGAGCATTACCAGGCACAGATC 
CAAATCAAAAAATTCTTTACACATCCGTCTAT 
AGCTCAGCAGATCTTATTGTCGTCAACAGCCT 
CTCTCGTTTAATTGGCGCAAGAAACATCCTGA 
TCCATGGCGTTGGTCATATCGGTCTATTAACC 
TCAAGCCAAGTGAAAGGGTATATTAAAGAAGG 
ACTGAACGGCGGAGGCCTCAATACAAATTAA 

SEQ ID NO: 11

Dc5cl 

ATGAAAGTGATTTTTGTTAAGAAAAGGAGTTT 
GCAAATTCTTGTTGCCCTTGCCTTAGTGCTAG 
GTTCAATAGCCTTCATCCAGCCGAAAGAAGCC 
AAAGCGGCTGAGCATAATCCGGTTGTAATGGT 
GCATGGCATGGGTGGTGCGTCTTATAACTTTG 
CTTCGATCAAACGATACTTAGTATCACAGGGA 
TGGGATCAAAACCAACTTTTTGCAATCGATTT 
CATAGACAAAACAGGCAATAACCTAAACAATG 
GCCCGAGGCTCTCGAGATTCGTGAAAGACGTA 
CTAGCCAAAACGGGCGCCAAAAAAGTAGATAT 
TGTGGCTCATAGTATGGGCGGTGCGAACACGT 
TATACTATATTAAAAACCTAGACGGTGGAGAT 

TV "A "jv TV mnv^ *jv tv tv tv /^r^rn/^/^rnOTV tv nvniv /'50nV!'/^-A^0 
AAAATTvjAAAAuCj i\-va i U AU A 1 1 ALjVj lAjrVaAVj^- 

AAACGGACTCGTATCACTCAGAGCATTACCAG 
GCACCGATCCAAATCAAAAAATTCTTTACACA 
TCTGTCTATAGCTCAGCCGATCTCATTGTCGT 
CAACAGCCTTTCGCGTTTAATTGGCGCAAGAA 
a PHTPPTn ATY^P Afr^OnTTnGAC ATATCGGT 
CTATTAACCTCAAGCCAAGTCAAAGGCTATGT 
GAAAGAAGGATTGAATGGCGGGGGACAGAATA 
CAAATTAA 

SEQ ID NO: 12

Dc5a2 

ATGAAAGTGATTTTTGTTAAGAAAAGGAGTTT 

GCAAATTCTTGTTGTGCTTGCATTGGTGATC 

GTTCAATGGCCTTCATCCAGCCAAAAGAGATC 

AGAGCGGCTGAGCATAATCCGGTTGTGATGGT 

ACATGGCATGGGCGGTGCGTCTTATAACTTTG 
CTTCGATTAAAAGTTACTTGGTATCACAAGGA
TGGGATCGAAACCAATTATTTGCTATCGATTT 
CATAGACAAAACAGGTAATAACCGCAACAATG 
GTCCGCGTCTATCCAGATTCGTCAAAGATGTG 
CTAGCCAAAACAGGTGCCAAAAAAGTTGATAT 
TGTGGCTCATAGTATGGGCGGAGCGAACACGT
TATACTATATTAAGAATCTAGACGGCGGCGAT 
AAAATAGAAAACGTTGTTACACTTGGTGGAGC 
GAACGGACTCGTTTCACTCAGAGCATTACCAG 
GCACCGATCCAAATCAAAAAATCCTTTACACA 
TCCGTCTACAGCTCAGCCGATCTTATCGTCGT 
CAACAGCCTCTCGCGTTTAATTGGCGCAAGAA 
ACGTCCTCATTCACGGCGTTGGTCACATCGGT 
CTATTAGCTTCAAGCCAAGTCAAAGGCTATAT 
CAAAGAAGGACTGAATGGCGGAGGCCAAAATA 
CAAATTAA 

SEQ ID NO: 13 

Dc512 

ATGAAAGTGATTTTTGTTAAGAAAAGGAGTTT 
GCAAATTCTCATTGCGCTTGCATTGGTGATTG 
GTTCAATGGCGTTTATCCAGCCGAAAGAGGCG 
AAGGCGGCTGAGCATAATCCGGTTGTGATGGT 
GCATGGCATTGGCGGTGCCTCTTATAACTTTT 
TTTCTATTAAAAGTTATTTGGCCACACAAGGC 
TGGGATCGAAACCAATTATATGCTATTGATTT 
CATAGACAAAACAGGAAATAACCGCAACAATG 
GTCCGCGTCTATCGAGATTCGTCAAAGATGTG 
TTAGACAAAACGGGTGCCAAAAAAGTAGATAT 
TGTGGCTCATAGTATGGGTGGAGCGAACACGC 
TATACTATATCAAGAATCTAGATGGCGGCGAT 
AAAATTGAGAACGTTGTCACAATTGGTGGAGC
AAACGGACTCGTTTCAAGCAGAGCATTACCAG 
GCACAGATCCAAATCAAAAAATTCTTTACACA 
TCCGTCTATAGCTCAGCAGATCTTATTGTCGT 
CAACAGCCTCTCTCAGTTTAATTGGCGCAAGA 
AACATCCTGATCCAGGCGTTGGTCATATCGGT 
CTATTAACCTCAAGCCAAGTGAAAGGGTATAT 
TAAAGAAGGACTGAACGGCGGAGGCCTCAATA 
CAAATTAA 

SEQ ID NO: 14


ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAAGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTCAGGGACAA 
GACAGGCAATAACTTAAACAACGGTCCAGTAT 
TATCGCGTTTCGTGAAAAAGGTATTAGATGAA 


WO 02/06457
PCT/US01/22160

ACCGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGCGGCGCTAACACGCTTTACTACA 
TAAAAAATTTGGATGGCGGTAATAAAATTGAA 
AACGTCGTAACACTTGGCGGCGCGAATCGTCT 
TGTGACAGGCAAGGCGCTTCCGGGTACTGATC 
CCAACCAAAAGATCTTGTACACATCCGTTTAC 
AGTAGTGCTGATATGATTGTTATGAATTACTT 
AACAAAATTAGACGGGGCTAAAAATGTTCAAA 
TTCATGGTGTCGGACATATCGGCCTTCTGTAC 
AGCAGCCAAGTCAACAGCCTGATTAAAGAAGG 
GCTTAACGGCGGAGGCCTCAATACAAATTAA 

SEQ ID NO: 15

Sgc 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG
GACAAGCTGTATGCAGTTGATTTCTGGGATAA 
GACAGGCAATAACTTAAACAACGGTCCAGTAT 
TATCGCGTTTTGTGAAAAAGGTATTAGATGAA 
ACCGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGCGGCGCTAACACGCTTTACTACA 
TAAAAAATTTGGATGGCGGTAATAAAATTGAA 
AACGTCGTAACACTTGGCGGCGCGAATCGTCT 
TGTGACAGGCAAGGCGCTTCCGGGTACTGATC 
CCAACCAAAAGATATTGTACACATCCGTTTAC 
AGTAGTGCTGATATGATTGTTATGAATTACTT 
ATCAAAATTAGACGGGGCTAAAAATGTTCAAA 
TTCATGGTGTCGGACATATCGGCCTTCTGTAC 
AGCAGCCAAGTCAATAGCCTGATTAAAGAAGG 
GCTTAACGGCGGAGGACTCAATACGAATTAA 

SEQ ID NO: 16

Sgd 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTTAGTGAGAA 
AACAGGCAATAACTTAAACAACGGTCCAGTAT 
TATCGCGTTTTGTGAAAAAGGTATTAGATGAA 
ACCGGTGCGAAAAAAGTGGATATTGTCGCTCA 
CAGCATGGGCGGCGCTAACACGCTTTACTACA 
TAAAAAATTTGGATGGCGGTAATAAAATTGAA 
AACGTCGTAACACTTGGCGGCGCGAATCGTCT 
TGTAACAGGCAAGGCGCTTCCGGGTACTGATC
CCAACCAAAAGATCTTGTACACATCCGTTTAC
AGTAGTGCTGATATGATTGTTATGAATTACTT
ATCAAAATTAGACGGGGCTAAAAATGTTCAAA 
TTCATGGTGTCGGACATATCGGCCTTCTGTAC 
AGCAGCCAAGTCAACAGCCTGATTAAAGAAGG 
GCTTAACGGCGGGGGCCTGAATACGAATTAA 

SEQ ID NO: 17

Sgf 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTCAAAGACAA 
GACAGGGAATAACCGCAACAATGGTCCGCGTC 
TATCGAGATTCGTCAAAGATGTGTTAGACAAA 
ACAGGAGCCAAAAAAGTAGATATTGTGGCTCA 
TAGTATGGGCGGAGCGAACACATTATACTATA 

AACGTTGTCACAATTGGTGGAGCAAACGGACT 
CGTTTCAAGCAGAGCATTACCAGGCACAGATC 
CAAATCAAAAAATTCTTTACACATCCGTCTAT 
AGCTCAGCAGATCTTATTGTCGTCAACAGTCT 
CTCTCGTTTAATTGGCGCAAGAAACGTCCAAA 
TCCATGGCGTTGGACATATCGGTCTATTAACC 
TCAAGCCAAGTCAAAGGATATATTAAAGAAGG 
ACTGAACGGCGGGGGCCTCAATACAAATTAA 

SEQ ID NO: 18 

Sgh 

ATGAAATTTGTAAAAAGAAGGATCCTTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGCCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTCATTGACAA 
GACAGGAAATAACCGCAACAATGGTCCGCGTC 
TATCGAGATTCGTCAAAGATGTGTTAGACAAA 
ACAGGAGCCAAAAAAGTAGATATTGTGGCTCA 
TAGTATGGGCGGAGCGAACACATTATACTATA 

AACGTTGTCACAATTGGTGGAGCAAACGGACT 
CGTTTCAAGCAGAGCATTACCAGGCACAGATC 
CAAATCAAAAAATTCTTTACACATCCGTCTAT 
AGCTCAGCAGATCTTATTGTCGTCAACAGTCT 
CTCTCGTTTAATTGGCGCAAGAAACGTCCAAA 
TCCATGGCGTTGGACATATCGGTCTATTAACC 
TCAAGCCTAGTCAAAGGATATATTAAAGAAGG 
ACTGAACGGCGGAGGCCAAAATACAAATTAA 

SEQ ID NO: 19

Mt2bl 

ATGAAAGTGATTTTTGTTAAGAAAAGGAGTTT 
GCAAATTCTTGTTGCCCTTGCCTTAGTGATAG 
GTTCAATGGCCTTCATCCAGCCAAAAGAAATC 
AAAGCAGCTGAGCACAATCCGGTTGTGATGGT 
ACATGGTATTGGAGGAGCGTCTTATAACTTTG 
CTTCGATTAAAAGTTATTTGGTTAACCAAGGC 
TGGGATCGAAACCAATTATTTGCTATCGATTT 
CATAGACAAAACAGGGAATAACCGCAACAATG 
GTCCTCGTTTATCTAGATTCGTCAAAGATGTG 
CTAGACAAAACGGGTGCCAAAAAAGTAGATAT 
TGTGGCGCATAGTATGGGCGGGGCGAACACGC 
TATACTATATTAAGAATCTAGATGGCGGCGAT 
AAAATTGAAAACGTCCar 1 C ACCA i i\3Cj IXjVjACjU 
AAACGGACTCGTTTCACTCAGAGCATTACCAG 
GAACAGATCCAAATCAAAAAATTCTCTATACA 
TCTGTCTATAGCTCAGCCGATTTGATTGTCGT 
CAACAGCCTTTCGCGTTTAACTGGCGCAAGAA 
ATGTCCTGATCCACGGCGTTGGCCATATCGGT 
CTATTAACCTCAAGCCAAGTGAAAGGGTATAT 
TAAAGAAGGACTGAACGGCGGGGGCCTAAATA 
CAAATTAA 

SEQ ID NO: 20

H2a 

ATGAAATTTGTAAAAAGAAGGATCATTGCACT 
TGTAACAATTTTGATGCTGTCTGTTACATCGC 
TGTTTGCGTTGCAACCGTCAGCAAAAGGCGCT 
GAACACAATCCAGTCGTTATGGTTCACGGTAT 
TGGAGGGGCATCATTCAATTTTGCGGGAATTA 
AGAGCTATCTCGTATCTCAGGGCTGGTCGCGG 
GACAAGCTGTATGCAGTTGATTTCAGGGACAA 
GACAGGAAATAACCGCAACAATGGTCCGCGTC 
TATCTAAATTCGTCAAAGATGTGTTAGACAAA 
ACGGGTGCCAAAAAAGTAGATATTGTGGGTCA 
TAGTATGGGCGGGGCGAACACGCTATACTATA 
TTAAGAATCTAGATGGCGGCGATAAAATTGAG 
AACGTTGTCACJUii IXjUCCjCjACjCAA i 
CGTTTCAAGCAGAGCATTACCAGGCACAGATC 
CAAATCAAAAAATTCTTTACACATCCGTCTAC 
AAGCTCAGCCGATCTCATTGTCGTCAACAGTC 

qTprnpmp/^mqvpTV AT'PfJnr*'mr'3\ A(^A A AP Afl'PPP 
-La^ X ± J. X ririX X VjVjV^ X Of^ivl.\jrt./irt.\-*rlVJ X 

AAATCCATGGCGTTGGACATATCGGTCTATTA 
ACCTCAAGCCAAGTCAAAGGATATATTAAAGA 
AGGACTGAACGGCGGGGGACTAAATACAAATT 
AA 

SEQ ID NO: 21

lfl5(G2) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA
TTATCGCGTTTTGTGAAAAAGGTATTAGATGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGCGGCGCTAACACGCTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAGTTGA 
AAACGTCGTAACGCTTGGCGGCACGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAATGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGACTCAATACGAATTGA 

SEQ ID NO: 22

3C12 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCTAGATTCGTCAAAGATGTGCTAGACAA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGGGCTAAAAATGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGACTCAATACGAATTGA 

SEQ ID NO: 23

3N19(G2) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
GGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGAAAAAGGTATTAGATGA 
AACCGGTGCGAAAAAAGTGGACATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CGGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATCCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGACTGAATACAAATTGA

SEQ ID NO: 24

G2.2 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT
CCAAATCAAAAGATTTTATACACATCCATTTA 
CGGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTACAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG
GACTGAACGGCGGGGGACTCAATACGAATTGA 

SEQ ID NO: 25

2C3 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTGGCG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGCGGCGCGAACACACTTTACTAC 

ATAAA A AATTTOO ATTtfiP rtOT A AT A A A ATTO A 

AAACGTCGTCACCATTGGTGGAGCAAACGGAC 
TCGTTTCAAGCAGAGCATTACCAGGCACAGAT 
CCAAATCAAAAAATTCTTTACACATCCGTCTA 
TAGCTCAGCAGATCTTATTGTCGTCAACAGTC 
TCTCTCGTTTAATTGGCGCAAGAAACGTCCAA
ATCCATGGCGTTGGACATATCGGTCTATTAAC 
CTCAAGCCAAGTCAAAGGATATATTAAAGAAG 
GGCTTAACGGCGGGGGCCACAATACGAATTGA 

SEQ ID NO: 26

2F11 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGAGCTTCATACAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACCGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAGGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTACAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAAAGGATATATTAAAGAAG 
GACTGAACGGCGGAGGCCTAAATACGAATTGA 

SEQ ID NO: 27

KV11(6C7) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA

TTGGAGGGGCATCATTCAGTTTTGCGGGAATT 

AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 

GGGCAAGCTGTATCCGGTTGATTTTTGGGACA 

AGACAGGGACGAATTATAACAATGGCCCGGTA 

TTATCACGATTTGTGCAAAAGGTTTTGGACGA 

AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 

ACAGTATGGGTGGCGCGAACACACTTTACTAC 

ATAAAAAATCTGGACGGCGGAAATAAAATTGA 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 

TGACGACAAGCAAGGCGCTTCCGGGTACTGAT 

CCCAACCAAAAGATCTTGTACACATCCGTTTA 

CAGTAGTGCTGATATGATTGTTATGAATTACT 

TATCAAAATTAGACGGGGCTAAAAATGTTCAA 

ATTCATGGCGTTGGGCACATTGGTTTAT^ 

GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 

GACTGAACGGCGGGGGCCTAAATACAAATTGA 

SEQ ID NO: 28

KV6(3A1) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAGTTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTGGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
AGAGTATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGTACTGAT 
CCCAACCAAAAGATCTTGTACACATCCGTTTA 
CAGTAGTGCTGATATGATTGTTATGAATTACT 
TATCAAAATTAGACGGGGCTAAAAATGTTCAA 
A i IVA i iialaCiCAC ATTCjGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCTAAATACAAATTGA 

SEQ ID NO: 29

KV2(2D1) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGAGCTTCATACAGTTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC
ACAGCATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCCAACCAAAAGATCTTGTACACATCCGTTTA 
CAGTAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGGGCTAAAAATGTTCAA 
ATTCATGGTGTCGGACATATCGGCCTTCTGTA 
CAGCAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCAAAATACAAATTGA 

SEQ ID NO: 30

N2.5 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGAGCTTCATACAGTTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGAACTGAT 
CCCAACCAAAAGATCTTGTACACATCCGTTTA 
CAGTAGTGCTGATATGATTGTTATGAATTACT 
TATCAAAATTAGACGGGGCTAAAAATGTTCAA 
ATTCATGGCGTTGGGCACACTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCACAATACAAATTGA 

SEQ ID NO: 31

KV5(2H6) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 

TTGGAGGAGCATCATACAATTTTGCGGGAATT 

AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 

GGGCAAGCTGTATACGGTTGATTTTTGGGACA 

AGACAGGCACAAATTATAACAATGGCCCGGTA 

TTATCACGATTTGTGCAAAAGGTTTTAGACGA 

AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 

ACAGCATGGGTGGCGCGAACACACTTTACTAC

ATAAAAAATCTGGACGGCGGAAATAAAATTGA 

AAACGTCGTAACGCTTGGCGGCGCGAATCGTC

TTGTAACAGGCAAGGCGCTTCCGGGAACAGAT 

CCCAATCAAAAGATTTTGTACGCATCCGTTTA 

CAGCAGTGCCGATATGATTGTCATGAATTACT

TATCAAAATTAGACGGOXSCTTVAAAACGTTCAA 

ATTCATGGCGTTGGGCACATTGGTTTAT^^ 

GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 

GACTGAACGGCGGGGGCCTGAATACAAATTGA 

SEQ ID NO: 32

3E5 

TGAACACAATCCAGTCGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTOTOCGGGAATT 
AGGAGCTATGTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
GGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGATGA 
AACCGGTGCGAAAAAAGTGGACATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGGGCTAAAAATGTTCAA 
ATCCATGGCGTTGGACACATCGGCCTTCTGTA 
CAGCAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCTCAATACGAATTGA 

SEQ ID NO: 33

G2.1 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 

TCGGAGGGGCATCATTCAATTTTGCGGGAATT 

AGGAGCTATCTCGTATCTCAGGGCTGGTCACG 

GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 

AGACAGGGACGAATTATAACAATGGCCCGGTA 

TTATCACGATTTGTGCAAAAGGTTTTAGACGA 

AACCGGTGCGAAAAAAGTGGACATTGTCGCTC 

ACAGCATGGGCGGCGCTAACACGCTTTACTAC 

ATAAAAAATCTGGACGGCGGAAATAAAATTGA 

AAACGTCGTAACGCTTGGCGGCACGAACCGTT

TGACGACAAGCAGGGCGCTTCCGGGAACAGAT 

CCAAATCAAAAGATTTTATACACATCCATTTA 

CAGCAGTGCCGATATGATTGTCATGAATTACT 

TATCAAAACTAGACGGTGCTAAAAACGTTCAA 

ATTCATGGCGTTGGGCACATTGGTTTATTG 

GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 

GACTGAACGGCGGGGGACTCAATACGAATTGA 

SEQ ID NO: 34

3H24(G2) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 

TO^GAGGGGCATCATTCAATTTI^ 

AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 

GGACAAGCCGTATGCGGTTGATTTTTGGGACA 

AGACAGGGACGAATTATAACAATGGCCCGGTA 

TTATCACGATTTGTGCAAAAGGTTTTAGACAA 

AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 

ACAGCATGGGGGGCGCGAACACACTTTACTAC 

ATAAAAAATCTGGACGGCGGAAATAAAGTTGA 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 

TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 

CCAAATCAAAAGATTTTATACACATCCATOT 
CAGCAGTGCCGATATGATTGTCATGAATTACT
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGACTCAATACGAATTGA 

SEQ ID NO: 35

KV10(4G6) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTGTCTCAGGGCTGGCCGCG 
GGACAAGCTGTATGCAGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

ATA A A A A ATY^TYZnAPf^rrw^ A A AT* A A APTTYZ A 

AAGCGTCGTAAC ACTTGGCGGCGCGAATCGTC 
TTGTAACAGGGAAGGCGCTTCCGGGAACTGAT 
CCCAACCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTCGGACATATCGGCCTTCTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCACAATACAAATTGA 

SEQ ID NO: 36

KV12(6D4) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGGGCATCATTCAGTTTTGCGGGAATT 
AGGAGCTATCTCGTATCTCAGGGCTGGCCGCG 
GGACAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGCACAAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTATTAGATGA 
AACCGGTGCGAAAAAAGTGGATATTGTCGCCC
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

A A A A A A A TY^nrv^ri A P riflpni^ A A A A A A fl'PTY^ A 

AAACGTCGTGACGCTTGGCGGCGCCAACCGTT 

TGACGACAGGCAAGGCGCTTCCGGGTACTGAT 

CCCAATCAAAAGATTTTATACACATCCGTTTA 

CAGCAGTGCCGATATGATTGTCATGAATTACT 

TATCAAAATTAGACGGTGCTAAAAACGTTCAA 

ATTCATGGCGTTGGGCACATTGGTTTATTGAT

GAACAGCCAAGTCAACAGGCTGA'OTAAAG 

GACTGAACGGCGGAGGCCACAATACAAATTGA 

SEQ ID NO: 37

N2.2 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 

TCGGAGGGGCATCATTCAGTTTOXXIGG^^ 

AGGAGCTATCTCGTATCTCAGGGCTGGCCGCG 

GGACAAGCTGTATGCGGTTGATTTTTGGGACA 

AGACAGGCACAAATTATAACAATGGCCCGGTA 

TTATCACGATTTGTGCAAAAGGTATTAGATGA 

AACCGGTGCGAAAAAAGTGGATATTGTCGCCT 
ACAGCATGGGTGGCGCGAACACACTTTACTAC

AAACGTCGTGACGCTTGGCGGCGCCAACCGTT 
TGACGACAGGCAAGGCGCTTCCGGGTACTGAT 
CCCAATCAAAAGATTTTATACACATCCGTTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGGCTGATTAAAGAAG 
GACTGAACGGCGGAGGCCACAATACAAATTGA 

SEQ ID NO: 38

N2.3 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGGGGGGCATCATTCAGTTTTGCGGGAATT 
AGGAGCTATCTCGTATCTCAGGGCTGGCCGCG 
GGACAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGCACAAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTATTAGATGA
AACCGGTGCGAAAAAAGTGGATATTGTCGCCC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

AAACGTCGTGACGCTTGGCGGCGCCAACCGTT 
TGACGACAGGCAAGGCGCTTCCGGGTACTGAT 
CCCAATCAAAAGATTTTATACACATCCGTTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGGCTGATTAAAGAAG 
GACTGAACGGCGGAGGCCACAATACAAATTGA 

SEQ ID NO: 39

N2.1 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGGGCATCATTCAGTTTTGCGGGAATT 
AGGAGCTATCTCGTATCCCAGGGCTGGCCGCG 
GGACAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGCACAAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTATTAGATGA 
AACCGGTGGGAAAAAAGTGGATATTGTCGCCC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

AAACGTCGTGACGCTTGGCGGCGCCAACCGTT 
TGACGACAGGCAAGGCGCTTCCGGGTACTGAT 
CCCAATCAAAAGATTTTATACACATCCGTTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGGCTGATTAAAGAAG 
GACTGAACGGCGGAGGCCACAATACAAATTGA 

SEQ ID NO: 40

KV4(2E12) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGACATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG
GGACAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTOXSTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGCGGCGCCAACACGCTTTACTAC 

AAACGTCGTGACGCTTGGCGGCGCGAACCGTT
TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTCATTAAAGAAG 
GACTGAACGGCGGGGGCCACAATACAAATTGA 

SEQ ID NO: 41

KV9(4C6) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGGGCATCATTCAGTTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACAAGCTGTATGCAGTTGATTTTAGTGACA 
AAACAGGCACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGIKX!AAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 

TV rn A A TV A TV AH^r^HV!!/*' A •TV?!/'50r!!!0'P A AT*A A A ATVTV^ A 

AAACGTCGTAACACTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGTACTGAT 
CCCAACCAAAAGATCTTGTACACATCCATTTA 
CAGCAGTGCCGATATGGTTGTCATGAATTACT 
TATCAAAATTAGACGGGGCTAAAAATGTTCAA 
ATTCATGGTGTCGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCACAATACGAATTGA 

SEQ ID NO: 42

7D6 

TAAACACAATCCAGTTGTTATGGTTCACGGTA 
T^KSGAGGGGCATCATACAATTTTGCGGGAATA 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACAAGCTGTATGCAGTTGATTTTAGTGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 

± XAXX^Av^oAX J. J. \7 X\jV-sAArxn.V7Vl X X X XrV^AV^V7rl 

AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC
ATAAAAAATCTGGACGGCGGTAATAAAATTGA 
AAACGTCGTAACACTTGGCGGGGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAACTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGATTAAATACGAATTGA 

SEQ ID NO: 43

3F3 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTC6AATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACCGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAT^GGCTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

A i AAAAAA i\- i\jVjACoVjC VjVa AAA 1 AAAA ± L\iA 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
TGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATCCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCAGAATACGAATTGA 

SEQ ID NO: 44

2D11(G2) 

TGAACACAATCCAGTTGTTATGGTTCACiGGTA 
TCGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
GGACAGGGAGGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGAAAAAGGTATTAGATGA 
AACCGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 

A i AAAAAA I i.\jt7A(-(jVjUlj(jrAAA i AAAA i 1\jA 

AAACGTCGTCACACTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GGCTGAACGGCGGAGGCCAGAATACGAATTGA 

SEQ ID NO: 45

3C23(G2)



IvjAAVJA^-AAX l^iJAvj 1 1 vj X X A XVivJ X X\-iAV-fVjVj X A 

TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
GGACAGGGACGAATTATAACAATGGCCCGGTA 
TOATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGT^TAAAATTGA
AAACGTCGTCACACTTGGCGGCGCGAACCGTT
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCTy^TCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GGCTTAACGGCGGGGGCCACAATACGAATTGA 

SEQ ID NO: 46 

G2.3 

TGAACACAATCCAGTCGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATA 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
GGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 

CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 

CCAAATCAAAAGATTTTATACACATCCATTTA 

CAGCAGTGCCGATATGATTGTCATGAATTGCT 

TATCAAAATTAGACGGTGCTAAAAACGTTCAA 

ATTC^TGGCGTTGGGCACATTGGTTTATTO 

GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 

GACTGAACGGCGGGGGCCAGAATACGAATTGA 

SEQ ID NO: 47

2A3 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCGTTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACAAGCTGTATGCAGTTGATTTCAAAGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGAAAAAGGTATTAGATGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGCGGCGCTAACACGCTTTACTAC 

A T A A A n A A T^^TTin A PnriP A A A T A A A ATTVi A 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGTACTGAT 
CCCAACCAAAAGATCTTGTACACATCC6TTTA 
CAGTAGTGC^roATATGATTGTTATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATl^TTTATTGAT 
GAACAGCCAAiSTCAACAGCCTGATTAAAGAAG 
6ACTGAACGGCGGAGGCCTAAATACAAATTGA 

SEQ ID NO: 48 

2F4 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGC^ 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACAAGCTGTATGCGGTTGATTTTTGGG^
AGACAGGGACGAATTATAACAATGGCCCGGTA
TTATCACGATTTGTGAAAAAGGTATTA6ATGA 
AACCGGTGCGAAAAAAGT6GATATTGTCGCTC 
ACAGCATGGGTGGCGCTAACACGCTTTACTAC 

TK ITTA TV TV 7\ TV TV mOnv^/^TV TV m TV TV TV TV ITUTV^ TV 

GAACGTCGTAACACTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATCTTGTACACATCCGTTTA 
CAGTAGTGCTGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GGCTGAACGGCGGAGGCCAGAATACGAATTGA 

SEQ ID NO: 49

2B9(G2) 

TGAACAC^UVTCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTITCCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACAAGCTGTATGCAGTTGATTTTTGGGGCA 
AGACAGGGACGAATTATAACAATGGfCCCGGTA 
TTATCGCGTTTTGTGAAAAAGGTATTAGATGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGGGGCGCGAACACACTTTACTAC 

TV rmv TV TV TV TV Ti m/^rrv^^^ n ^^^^^/^/^ iv TV Tv rnTv tv tv tvmu i tv 

ATAAAAAATCTGK5ACGGCGGAAATAAAATTGA 

AAACGTCGTAACACTTGGCGGCGCGAACCGTT

CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 

CCAAATCAAAAGATTTTATACACATCCATTTA 

CAGCAGTGCCGATATGATTGTCATGAATTACT 

TATCAAAATTAGACGGGGCTAAAAATGTTCAA 

ATTCATGGCGTTGGGCACATTGGTTTATTGAT 

GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 

GACTGAACGGCGGAGGCCAAAATACGAATTGA 

SEQ ID NO: 50

2C5 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAACT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCAGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TOATCGCGTTTTGTGAAAAAGGTATTAGATGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 

ATAAAAAATCTGGATGGCGGTAATAAAATTGA 
AAACGTCGTCACACTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACTGAT 
CCCAACCAAAAGATTTTATAGACATCCATTTA 
CT^GCAGTGCCGATATGATTGTCATGAATTACT 
TATCAATy^TTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG
GACTGAACGGCGGAGGCCAAAATACGAATTGA

SEQ ID NO: 51

KV1(2A6) 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTCAAGGACA 
AGACAGGCACAAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGAAAAAGGTATTAGATGA 
AACCGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGCGGCGCTAACACGCTTTACTAC 
ATA A A A A A APfi<^nfiAJU^TAAAATTGA 
AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGTACTGAT 
CCCAACCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GGCTTAACGGCGGGGGCCAGAATACGAATTGA 

SEQ ID NO: 52

2D13(G2) 

TAAACACAATCCAGTTGTTATGGTTCACGGTA 
TTGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACGAGCTGTATGCGGTTGATTTTTGGGACG 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACCGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 

ATAAAAAATpfrnnAPf^PRfiA AATAAAATTf?A 

AAACGTCGTAACGCTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGTACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAATGTTCAA 
ATTCT^TGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GGCTGAACGGCGGAGGCCAAAATACGAATTGA 

SEQ ID NO: 53

3C8 

OXSAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGGGCATCATTCAATTTTGCGGGAAOT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCGCG 
GGACAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAGTTGA 
AAACGTCGTAACACTTGGCGGCGCGAATCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT
CCAAATCAAAAGATTTTATACACATCCATTTA
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAACGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GACTGAACGGCGGGGGCCAAAATACAAATTGA 

SEQ ID NO: 54

2D5 

TGAACACAATCCAGTTGTTATGGTTCACGGTA 
TCGGAGGGGCATCATTCAATTTTGCGGGAATT 
AAGAGCTATCTCGTATCTCAGGGCTGGTCACG 
GGGCAAGCTGTATGCGGTTGATTTTTGGGACA 
AGACAGGGACGAATTATAACAATGGCCCGGTA 
TTATCACGATTTGTGCAAAAGGTTTTAGACGA 
AACGGGTGCGAAAAAAGTGGATATTGTCGCTC 
ACAGCATGGGTGGCGCGAACACACTTTACTAC 
ATAAAAAATCTGGACGGCGGAAATAAAATTGA 
AAACGTCGTAACACTTGGCGGCGCGAACCGTT 
CGACGACAAGCAAGGCGCTTCCGGGAACAGAT 
CCAAATCAAAAGATTTTATACACATCCATTTA 
CAGCAGTGCCGATATGATTGTCATGAATTACT 
TATCAAAATTAGACGGTGCTAAAAATGTTCAA 
ATTCATGGCGTTGGGCACATTGGTTTATTGAT 
GAACAGCCAAGTCAACAGCCTGATTAAAGAAG 
GGCT6AACGGCGGAGGACAAAATACAAATTGA 

SEQ ID NO: 55

405 (pumilus) 

MKFVKRRIIALVTILVLSVTSLFAMQPSAKAA 

EHNPWMVHGIGGASYISIFAGIKSYLVSQGWSR 

GKLYAVDFVTOKTGTNYNNGPVLSRFVQKVLDE 

TGAKKVDIVAHSMGGANTPYYIKlSniilXSCSSrKI 

NWTLGGANRSTTSKALPGTDPNQKILYTSIY 

SSADMrVMNYLSKLDGAKNAQIHGVGHIGLLM 

NSQVNSLIKEGLNGGGQNTN

SEQ ID NO: 56

406 (subtilis) 

MKFVKRRIIALVTILMLSVTSLFALQPSAKAA

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

DKLYAVDFWDKTGTNYISINGPVLPRFVQKVI^ 

TGAKKVDIVAHSMGGANTLYYIKNLIX3<S^^ 

NVVTLGGAimiTTGPCALPGTDPNQKILYTSIY 

SSADMIYIJSrnjSRLDGARNVQIHGVGHIGLLY 

SSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 57

402 (megaterium) 

MKFVKRRIIALVTILVLSVTSLFAMQPSAKAA 

DTIQLLWFTGIGGASYNFAGIKSYLVSQGWSR 

GIOiYAVDFVTOKTGIOTmGPVLSRFVQKVLDE 

TGAKKVDIVAHSMGGAOTLYYIKNIJXSGl^ 

NVVTLGGANRIiTTSKALPGTDPNQKILYTSIY 

SSADMIVMJSr^LSIOiDGAKWQIHGVGHIGLm 

NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 58

400 (lentus)

MKFVKRRIIALVTILVLSVTSLFAMQPSAKAA
EHNPVVMVHGIGGASYNFAGIKSYLVSQGWSR
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE
NVVTLGGANRLTTSKALPGTDPNQKILYTSIY
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 59 

396 (circulans) 

MKFIKRRIIALVTILVLSVTSLFAMQPSAKAA 
EHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTtm^NGPVLSRPVQKVLDE 
TGAKKTOIYAHSMGGMTTLYYIKlSniiDGGNKIE 
NVVTLGGANRLTTSKMiPGTDPNQKILYTSIY 
SSADMIVMlSr^SKLDGAKNVQIHGVGHIGLIiM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 60

392 (azotoformans)

MKFVKRRIIALVTILVLSVTSLFAMQPSAKAA
EHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
GELYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAIQCVDIVAHSMGGANTLYYIKNLDG<aJKIE 
NWTLGGANRLTTSKALPGTDPNQKILYTSIY 
SSANMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLDTN 

SEQ ID NO: 61 

398 (firmus) 

MKFVKRRIIALVTILVLSVTSLFAMQPSAKAA
EHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NVVTLGGANRLTTSKALPGTDPNQKILYTSIY
SSADMIVMNYLSKLDGAKNAQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 62 

393 (badius) 

MKFVKRRIIALVTILVLSVTSLFAMQPSAKAA
EHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NVVTLGGANRLTTSKALPGTDPNQKILYTSIY
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 63 

Dc5h 

MKFVKRRIIALVTILMLSVTSLFALQPSAKAA
EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DKLYAVDFKDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKVE 
NWTLGGANRLTTGKALPGTDPNQKILYTSIY 
SSADMTVMNYLSRLDGARNVQIHGVGHIGLLY 
SSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 64 

Dc5f 

MKFVKRRIIALVTILMLSVTSLFALQPSAKAA
EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DKLYAVDFXDKTGNNRNNGPRLSRFVKDVLDK 
TGAKKVDIVAHSMGGANTLYYIKNLDGGDKIE 
NWTIGGANGLVSSRALPGTDPNQKILYTSVY 
SSADLIWNSLSRLIGARNILIHGVGHIGLLT 


WO 02/06457

PCT/US01/22160




SSQVKGYIKEGLNGGGLNTN 

SEQ ID NO: 65

Dc5cl 

MKVIFVKKRSLQILVALALVLGSIAFIQPKEA 
KAAEHNPWMVHGMGGASYNFASIKRYLVSQG 
WDQNQLFAIDFIDKTGNNLNNGPRLSRFVKDV 
LAKTGAKKVDIVAHSMGGANTLYYIKNLDGGD 
KIENWTLGGl^GLVSLRALPGTDPNQKILYT 
SVYSSADLIVVNSLSRLIGARNVLIHGVGHIG
LLTSSQVKGYVKEGLNGGGQNTN

SEQ ID NO: 66

Dc5a2 

MKVIFVKKRSLQILVVIiALVMGSMAFIQPKEI 

RAAEHNPWMVHGMGGASYNFASIKSYLVSQG 

WDIOTQLFAIDFIDKTGNNM^GPRLSRFVKD 

LAKTGAKKVDIVAHSMGGANTLYYIKNLIXM 

KIENWTLGGANGLVSLRALPGTDPNQKILYT 

SVYSSADLIWNSLSRLIGARNVLIHGVGHIG 

LLASSQVKGYIKEGLNGGGQNTN 

SEQ ID NO: 67

Dc512 

MKVIFVKKRSLQILIAiiALVIGSMAFIQPK^ 
KAAEHNPWMVHGIGGASYNFFSIKSYLATQG 
WDRNQLYAIDFIDKTGISINI^GPRLSKFVKDV 
LDKTGAKKVDIVAHSMGGANTLYYIKNLDGGD 
KIENVVTIGGANGLVSSRALPGTDPNQKILYT 
SWSSADLIVVNSLSQFIOTRKKHPDPGVGHIG 
LLTSSQVKGYIKEGLNGGGLNTN 

SEQ ID NO: 68

Sga 

MKFVKRRIIALVTILMLSVTSLFALQPSAKAA

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

DKLYAVDFRDKTCTJMLilS^ 

TGAKKVDIVAHSMGGANTLYYIKNLIX5GNKIE 

NWTLGGANRLVTGKALPGTDPNQKILYTSVY 

SSADMIVMNYLTKLDGAKNVQIHGVGHIGLLY 

SSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 69

Sgc 

MKFVKRRIIALVTILMLSVTSLFALQPSAKAA 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

DKLYAVDFVTOKTGNlSrLlSINGPVLSRFV^^ 

TGAKKVDIVAHSMGGAOTLYYIKNLDGGNKIE 

NWTLGGANRLVTGKALPGTDPNQKILYTSVY 

SSADMIVMlSnfLSKLDGAKNVQIHGVGHIGLIiY 

SSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 70

Sgd 

MKFVKRRIIALVTILMLSVTSLFALQPSAKAA
EHNPVVMVHGIGGASFNFAGIKSYLVSQGWSR
DKLYAVDFSDKTGNNLNNGPVLSRFVKKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRLVTGKALPGTDPNQKILYTSVY 
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLY 
SSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 71

Sgf 

MKFVKRRIIALVTILMLSVTSLPALQPSAKAA
EHNPVVMVHGIGGASFNFAGIKSYLVSQGWSR
DKLYAVDFKDKTGNNRNNGPRLSRFVKDVLDK
TGAKKVDIVAHSMGGANTLYYIKNLDGGDKIE
NVVTIGGANGLVSSRALPGTDPNQKILYTSVY

SSQVKGYIKEGLNGGGLNTN

SEQ ID NO: 72

Sgh 

MKFVKRRILALVTILMLSVTSL^ 

EHNPVVMVHGIGGASFHFAGIKSYLVSQGWSR 

DKLYAVDFIDKTG]S!NR1SINGPRLSRFVKDV^ 

TGAKK\roiVAHSMGGANTLYYIKNLDGGDKIE 

NWTIGGANGLVSSRALPGTDPNQKILYTSVY 

S S AT^T . T VUKfCIT . C!"R T . Tn 2VP"Kr\7"n TWn\rf2M TP T .T 
iDiDnuxjj. V viNOiJOJxiJX\jjarLLTi V y xriVj v oriX\^ JL 

SSLVKGYIKEGLNGGGQNTN 

SEQ ID NO: 73

Mt2bl 

MKVIFVKKRSLQILVALALVIGSMAFIQPKEI

KAAEHNPWMVHGIGGASYNFASIKSYLVNQG 

WDRNQLFAIDFIDKTCTINRNNGPRLSRFV^ 

LDKTGAKKVDIVAHSMGGANTLYYIKNLDGGD 

KIENWTIGGANGLVSLRALPGTDPNQKILYT 

LLTSSQVKGYIKEGLNGGGLNTN 

SEQ ID NO: 74

H2a 

MKFVKIUIIIALVTILMLSVTSLFALQPSAKA^ 
EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DKLYAVDFRDKTGNNRISnSfGPRLSKFV^ 
TGAKKVDIVAHSMGGANTLYYIKNIiDGGDKIE 

KLSRSHCRQQSLSFNWLQETVQIHGVGHIGLL 
TSSQVKGYIKEGLNGGGLNTN 

SEQ ID NO: 75

lfl5(G2) 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

GKLYAVDFVTOKTGTNYWNGPVLSRFVKKV^ 

TGAiaCVDIVAHSMGGANTLYYIK]SrLDGGNK\^ 

"KTWTT /^JOTTOTR Clnvp o A T . Pfi'PTl PTVim?" T T . VT^ Q T V 
±w V xxjV7V7jLx>frs.iDX X t^x\^\ijir\^ ±ijrVi\^r\.j.±jX X X 

SSADMIVMlSnn^SKLDGAKWQIHGVGHIGLm 

NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 76

3C12 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

GKLYAVDFWDKTGTiraiNGPVLSRFVKDVL^ 

TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 

V V X XJO\3rU>iJ\.0 X X O J\x>LiXro X UlzSi}i\JjS,d.l-iX X O -L X 

SSADMTVMimiSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLNTN

SEQ ID NO: 77 

3N19(G2) 

EHNPWMVHGIGGASFNFAGIKSYTjVSOGWSR 

GKLYAVDFWDRTCT^rraNGPVLSRFVKKVl^E 

TGAKKOT>IVAHSMGGA]SrrLYYIKNIiDGG^ 

IWVTLGGANRLTTSKALPGTDPNQKILYTSIY 

GSADMIVMNYLSKLDGAKNVQIHGVGHIGLIiM 

NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 78

G2.2 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE
NVVTLGGANRSTTSKALPGTDPNQKILYTSIY
GSADMIVMimiSKLIX5AKlWQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 79

2C3 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

GKLYAVDFWDKTGTNYimGPVLSRWQKVIiD 

TGAKK\miVAHSMGGA]OTLYYIKNLDGGNKIE 

NVVTIGGANGLVSSRALPGTDPNQKILYTSVY

SSADLIWNSLSRLIGARNVQIHGVGHIGLLT 

SSQVKGYIKEGLNGGGHNTN

SEQ ID NO: 80

2F11 

EHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
GKLYAVDFVTOKTGTNY^GPVLSRFVQKVLDE 
TGAKKVDWAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRLTTSRALPGTDPNQKILYTSIY 
SSADMIVMimjSKEiDGAKNVQIHGVGHIGLI^ 
NSQVKGYIKEGLNGGGLNTN 

SEQ ID NO: 81

KV11(6C7) 

EHNPWMVHGIGGASFSFAGIKSYLVSQGWSR 
GKLYPVDFVTOKTGTISrmSFGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGAOTLYYIKNLDGGNKIE 
NWTLGGANKLTTSKALPGTDPNQKILYTSVY 
SSADMIVMimiSKLDGAKNVQIHGVGHIGLIiM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 82

KV6(3A1) 

EHNPWMVHGIGGASFSFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYlSfNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRLTTSKALPGTDPNQKILYTSVY 
SSADMIVJyttlYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 83

KV2(2D1) 

EHNPWMVHGIGGASYSFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRLTTSKALPGTDPNQKILYTSVY 
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLY 
SSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 84

N2.5 

EHNPVVMVHGIGGASYSFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRLTTSKALPGTDPNQKILYTSVY 
SSADMIVMNYLSKLDGAKNVQIHGVGHTGLL^ 
NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 85

KV5(2H6) 

EHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
GKLYTVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRLVTGKALPGTDPNQKILYASVY 
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLNTN

SEQ ID NO: 86

3E5

EHNPWMVHGIGGASFNFAGIRSYLVSQGWSR 
GKLYAVDFWDRTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 

SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLY 
SSQVNSLIKEGLNGGGLNTN

SEQ ID NO: 87 

G2.1 

EHNPWMVHGIGGASFNFAGIRSYLVSQGWSR 
GKLYAVDFVTOKTGTISnn^GPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGAOTLYYIKNLIX3G3S[I^ 

i>i vV X Xjovj X XMXvLi X X ol\AXjirVj i LrirJN Vf -f^J--!-' ^ i O X X. 

SSADMIVMimiSKLIXSAKWQIHGVGHIGLriM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 88 

3H24(G2) 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DKPYAVDFWDKTGTNYNNGPVLSRFVQKVLDK 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKV^ 

i\J V V X JjWjiilNIKiji 1 olS-tyjF(jXUirJ>iyjS.xlji i O J. X 

SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGLNTN

SEQ ID NO: 89 

KV10(4G6) 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWPR 
DKLYAVDFVTOKTGTNYlSnsrGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLIXSGNK^ 
oVv XJjij<jAJNKijVlAjlS-tyjFljiL>irW loXx 

SSADMIViynSfYLSKLDGAiaWQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 90 

KV12(6D4) 

EHNPWMVHGIGGASFSFAGIRSYLVSQGWPR 

DKLYAVDFTWDKTGTNYNNGPVLSRFVQ 

TGAKKVDIVAHSMGGAJSPTLYYIKNLIXSGNK^ 

XM VV XXjvioriXNIKXjX X\j JSALi Ir Va ii-'iri^yJXXXjX X o v X 

SSADMIVMOTiSKLDGAKNVQIHGVGHIGLLM 
NSQVNRLIKEGLNGGGHNnsr 

SEQ ID NO: 91 

N2.2 

EHNPWMVHGIGGASFSFAGIRSYLVSQGWPR 
DKLYAVDFWDKTGTNYISINGPVLSRFVQKVLDE 
TGAKKVT)IVAYSMGGANTLYYIKNLDG<asrKVE 

XN V V X lAiVj/UNlKXj X X\5J\AXjlrV3 i JJjrJNyjN-XiJ X 1 o V x 

SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNRLIKEGLNGGGHimr 

SEQ ID NO: 92 

N2.3 

EHNPWMVHGIGGASFSFAGIRSYLVSQGWPR 

TGAKKVDIVAHSMGGANTLYYIKNLDGGNKVG 
NWTLGGANRLTTGKALPGTDPNQKILYTSVY 
SSADMiVMimiSKLDGAKIWQIHGVGHIGLriM 
NSQVNRLIKEGLNGGGHNTN 

SEQ ID NO: 93 

N2.1 

EHNPWMVHGIGGASFSFAGIRSYLVSQGWPR 
DKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKVE 
NVVTLGGANRLTTGKALPGTDPNQKILYTSVY
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM
NSQVNRLIKEGLNGGGHNTN 

SEQ ID NO: 94

KV4(2E12) 

EHNPWMVHGIGGTSFNFAGIKSYLVSQGWSR 
DKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 

SSADMIVMNYLSK]^GAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 95

KV9(4C6) 

EHNPWMVHGIGGASFSFAGIKSYLVSQGWSR 
DKLYAVDFSDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 

SSADMVViyDSrnjSKLDGAKNVQIHGVGHIGLLM 
NSQWSLIKEGIiNGGGHNTN 

SEQ ID NO: 96

7D6 

KHNPWMVHGIGGASYNFAGIKSYLVSQGWSR 
DKLYAVDFSDKTGmraiNGPVLSRFVQKVLDE 
TGAKKVDrVAHSMGGANTLYYIKlSniiDGGr^ 

SSADMIVMNYLSKXiDGAKWQIHGVGHIGLIM 
NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 97

3F3 

EHNPWMVHGIGGASFNFAGIKSYLESQGWSR 
GKLYAVDFVOKTGTlSrZNNGPVLSRFVQKALDE 
TCAKKVDIVAHSMGGANTLYYIKNLIXS^^ 

SSADMIVMimiSKLDGAKNVQIHGVGHIGLIiM 
NSQVNSLIKEGLNGGGQNTN

SEQ ID NO: 98

2D11(G2) 

EHNPWM\7HGIGGASFNFAGIKSYLVSQGWSR 
GKIiYAVDFWDRTGTJSrmWGPVLSRFVKKVLDE 
TGAKIO/DIVAHSMGGAOTLYYIKNLIXSGN^^ 

TvnnTTPT r?r»a"KTDCT>TC'K*aT prirp'npMm^'TT.VTCiTV 
JMV V 1 JjiivjiilNrto i i i2JSAljr\3 llJtrvi\^r>^±LtJL J-oX X 

SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 99

3C23(G2) 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
GKLYAVDFVTORTGTimmGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNIiDG<3^ 

WVV iijUVjAJNKoi ioJNiVuirl^ loXx 

SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGHNTN 

SEQ ID NO: 100

G2.3 

EHNPVVI^GIGGASFNFAGIKSYLySQGWSR 
GKLYAVDFWDRTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDrwmSMGGANTLYYIKNLIXSGNKIE 
NWTLGGANRSTTSKALPGTDPNQKILYTSIY 
SSADMIVmCLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 101

2A3 

EHNPVVMVHGIGGASFNFAGIKSYLVSQGWSR
DKLYAVDFKDKTGTlSra5NGPVLSRFVKK^

TGAKKVDIVAHSMGGi^LYYIKNLDG(3NfKIE 

l!5VVTLGGA]SIRSTTSKALPGTDPNQKILYTSVy 

SSADMIVMimiSKLDGAKWQIHGVGHIGLLM 

NSQVNSLIKEGLNGGGLNTN 

SEQ ID NO: 102

2F4 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 

DKLYAVDFVTOKTGTNYT^NGPVLSRFVKK^^ 

TGAKKVDIVAHSMGGANTLYYIKNLDGGDKIE 

NWTLGGANRSTTSKALPGTDPNQKILYTSVY 

SSADMIVMNYLSKLIX3AKWQIHGVGHIGLLM 

NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 103

2B9(G2) 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DKLYAVDFWGKTGTOFYNNGPVLSRFVKKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNIilXSGNK 
NVVTLGGANRSTTSKALPGTDPNQKILYTSIY
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM
NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 104

2C5 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTira^NGPVLSRFVKKVLDE 
TGAKKVDrVAHSMGGANTLYYIKNLDGGNKIE 
NVVTLGGANRSTTSKALPGTDPNQKILYTSIY
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 105

KV1(2A6) 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
GKLYAVDFKDKTGTNYNNGPVLSRFVKKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRSTTSKALPGTDPNQKILYTSIY 
SSADMIVMSr^LSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 106

2D13(G2) 

KHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DELYAVDFVTOETGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NVVTLGGANRSTTSKALPGTDPNQKILYTSIY
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGMIGGGQNTN 

SEQ ID NO: 107

3C8 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
DKLYAVDFVTOKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKVE 
NWTLGGANRSTTSKALPGTDPNQKILYTSIY 
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM 
NSQVNSLIKEGLNGGGQNTN 

SEQ ID NO: 108

2D5 

EHNPWMVHGIGGASFNFAGIKSYLVSQGWSR 
GKLYAVDFWDKTGTNYNNGPVLSRFVQKVLDE 
TGAKKVDIVAHSMGGANTLYYIKNLDGGNKIE 
NWTLGGANRSTTSKALPGTDPNQKILYTSIY 
SSADMIVMNYLSKLDGAKNVQIHGVGHIGLLM

WHAT IS CLAIMED IS:

1. An isolated or recombinant polypeptide comprising a sequence having 
at least 97% amino acid sequence identity to any one of SEQ ID NO: 75 to SEQ ID NO: 108. 
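As a rough illustration of the identity threshold recited in claim 1, the following Python sketch computes percent identity over two sequences that have already been aligned to equal length. The function names, the use of "-" as a gap symbol, and the choice to count gap columns in the denominator are assumptions made for this example only. They are not taken from the specification, and an actual determination would rely on an established alignment program and its stated parameters.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns whose residues are identical.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are ignored (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold: float = 97.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences, not sequences from the listing.
    reference = "A" * 100
    candidate = "A" * 98 + "TT"
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate))
```

Under these assumptions, a candidate differing from a 100-residue reference at two aligned positions would satisfy a 97% threshold, while one differing at four positions would not.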

2. The polypeptide of claim 1, wherein said polypeptide comprises lipase 

activity. 

3. The polypeptide of claim 1, wherein said polypeptide degrades geranyl
butyrate, neryl butyrate, or both geranyl butyrate and neryl butyrate. 

4. The polypeptide of claim 3, wherein said polypeptide exhibits 
enantioselectivity for geranyl butyrate over neryl butyrate.

5. The polypeptide of claim 4, comprising a sequence selected from: SEQ
ID NO:76, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:86, SEQ ID NO:96, SEQ ID
NO:97, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:107, SEQ ID
NO:108, SEQ ID NO:78, SEQ ID NO:87, SEQ ID NO:100, SEQ ID NO:75, SEQ ID NO:77,
SEQ ID NO:88, SEQ ID NO:98, SEQ ID NO:99, SEQ ID NO:103, or SEQ ID NO:106.

6. The polypeptide of claim 3, wherein said polypeptide exhibits 
enantioselectivity for neryl butyrate over geranyl butyrate. 

7. The polypeptide of claim 6, comprising a sequence selected from: SEQ
ID NO:81, SEQ ID NO:82, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:89, SEQ ID
NO:90, SEQ ID NO:94, SEQ ID NO:95, SEQ ID NO:105, SEQ ID NO:84, SEQ ID NO:91,
SEQ ID NO:92, or SEQ ID NO:93.

8. The polypeptide of claim 1, comprising a polypeptide encoded by a 
polynucleotide sequence, which polynucleotide sequence hybridizes under highly stringent 
conditions over substantially the entire length of: a polynucleotide sequence selected from
SEQ ID NO: 1 to SEQ ID NO: 54, or a complementary sequence thereof; or a polynucleotide
sequence encoding a polypeptide sequence selected from SEQ ID NO: 55 to SEQ ID NO:
108, or a complementary sequence thereof; wherein said polypeptide comprises one or more
of: Lys at position 1; Thr at position 14; Ser at position 17; Arg at position 22; Glu at 
position 26; Pro at position 31; Gly at position 33; Glu at position 34; Pro at position 35; Pro 
or Thr at position 37; Ser or Lys at position 41; Gly at position 42; Arg or Glu at position 43; 
Ala at position 61; Tyr at position 75; Gly at position 96; Ser at position 97; Thr at position 
104; Ser at position 107; Ala at position 125; Gly at position 129; Val at position 134; Cys at 
position 138; Lys at position 141; Lys at position 146; Thr at position 156; Met at position 
160; Arg at position 166; or His at position 177. 

9. The polypeptide of claim 1 comprising one or more of: Lys at position 
1; Thr at position 14; Ser at position 17; Arg at position 22; Glu at position 26; Pro at 
position 31; Gly at position 33; Glu at position 34; Pro at position 35; Pro or Thr at position 
37; Ser or Lys at position 41; Gly at position 42; Arg or Glu at position 43; Ala at position 
61; Tyr at position 75; Gly at position 96; Ser at position 97; Thr at position 104; Ser at 
position 107; Ala at position 125; Gly at position 129; Val at position 134; Cys at position 
138; Lys at position 141; Lys at position 146; Thr at position 156; Met at position 160; Arg at 
position 166; or His at position 177. 
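The position-specific residues listed in claim 9 can be checked mechanically. The sketch below is illustrative only: it assumes positions are numbered from the first residue of the mature sequence (which appears consistent with, for example, Lys at position 1 of SEQ ID NO: 96 as listed), and the names `CLAIMED_RESIDUES` and `claimed_residues_present` are hypothetical.

```python
# Position (1-based, mature-sequence numbering assumed) -> allowed residues,
# transcribed from the list recited in claim 9 using one-letter codes.
CLAIMED_RESIDUES = {
    1: "K", 14: "T", 17: "S", 22: "R", 26: "E", 31: "P", 33: "G",
    34: "E", 35: "P", 37: "PT", 41: "SK", 42: "G", 43: "RE", 61: "A",
    75: "Y", 96: "G", 97: "S", 104: "T", 107: "S", 125: "A", 129: "G",
    134: "V", 138: "C", 141: "K", 146: "K", 156: "T", 160: "M",
    166: "R", 177: "H",
}


def claimed_residues_present(mature_seq: str) -> list[tuple[int, str]]:
    """Return (position, residue) pairs where the sequence carries one of
    the listed residues. Positions beyond the sequence end are skipped."""
    hits = []
    for position, allowed in sorted(CLAIMED_RESIDUES.items()):
        if position <= len(mature_seq):
            residue = mature_seq[position - 1]
            if residue in allowed:
                hits.append((position, residue))
    return hits


if __name__ == "__main__":
    # First 32 residues only, so positions above 32 are not evaluated here.
    fragment = "KHNPVVMVHGIGGASYNFAGIKSYLVSQGWSR"
    print(claimed_residues_present(fragment))
```

A sequence satisfies the "one or more of" condition in this sketch whenever the returned list is non-empty.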

10. The polypeptide of claim 9, wherein said polypeptide has lipase activity. 

11. The polypeptide of claim 9, wherein said polypeptide degrades geranyl 
butyrate, neryl butyrate, or both geranyl butyrate and neryl butyrate. 

12. The polypeptide of claim 11, wherein said polypeptide exhibits
enantioselectivity for geranyl butyrate over neryl butyrate. 

13. The polypeptide of claim 12, comprising one or more of: Arg at position 
22; Gly at position 33; Ser or Lys at position 41; Arg at position 43; Ser at position 107; Lys 
at position 141; Lys at position 146; Met at position 160; or His at position 177.

14. The polypeptide of claim 12, comprising one or more of: Arg at position
43; or Ser at position 107.

15. The polypeptide of claim 11, wherein said polypeptide exhibits
enantioselectivity for neryl butyrate over geranyl butyrate. 

16. The polypeptide of claim 15, comprising one or more of: Ser at position 
17; Arg at position 22; Pro at position 31; Gly at position 33; Ser or Lys at position 41; Lys at 

position 141; Lys at position 146; Met at position 160; Arg at position 166; or His at position
177. 

17. The polypeptide of claim 15, comprising one or more of: Ser at position 
17; Pro at position 31; or Arg at position 166. 

18. An isolated or recombinant polypeptide comprising a sequence having at 
least 94% amino acid sequence identity to the mature region of SEQ ID NO: 55, 61, 64, 65,

67, 68, 70 or 72. 

19. The polypeptide of claim 18, wherein said polypeptide comprises a
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID
NO: 55. 

20. The polypeptide of claim 19, wherein said polypeptide comprises a

sequence selected from SEQ ID NO: 55, 58-62, 75-78, 80-88, or 94-108, or the mature 
region thereof. 

21. The polypeptide of claim 18, wherein said polypeptide comprises a 
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID 

NO: 61.

22. The polypeptide of claim 21, wherein said polypeptide comprises a
sequence selected from SEQ ID NO: 55, 57-62, 75-78, 80-90, or 93-108, or the mature 
region thereof. 

23. The polypeptide of claim 18, wherein said polypeptide comprises a 

sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID
NO: 64.

24. The polypeptide of claim 23, wherein said polypeptide comprises a
sequence selected from SEQ ID NO: 64, 71, or 72, or the mature region thereof. 

25. The polypeptide of claim 18, wherein said polypeptide comprises a 
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID 

NO: 65.

26. The polypeptide of claim 25, wherein said polypeptide comprises a 
sequence selected from SEQ ID NO: 65, 66, or 73, or the mature region thereof. 

27. The polypeptide of claim 18, wherein said polypeptide comprises a
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID 

NO: 67.

28. The polypeptide of claim 27, wherein said polypeptide comprises the 
sequence SEQ ID NO: 67, or the mature region thereof. 

29. The polypeptide of claim 18, wherein said polypeptide comprises a 
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID

NO: 68.

30. The polypeptide of claim 29, wherein said polypeptide comprises a 
sequence selected from SEQ ID NO: 68 or 101, or the mature region thereof. 

31. The polypeptide of claim 18, wherein said polypeptide comprises a 
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID 

NO: 70.

32. The polypeptide of claim 31, wherein said polypeptide comprises a
sequence selected from SEQ ID NO: 63, 68-70, 82-83, 85-86, 96, or 101-102, or the mature
region thereof.

33. The polypeptide of claim 18, wherein said polypeptide comprises a
sequence having at least 94% amino acid sequence identity to the mature region of SEQ ID 
NO: 72. 

34. The polypeptide of claim 33, wherein said polypeptide comprises a 
sequence selected from SEQ ID NO: 64, 71, or 72, or the mature region thereof.

35. An isolated or recombinant polypeptide comprising a sequence having at 
least 85% amino acid sequence identity to the mature region of SEQ ID NO: 74. 

36. The polypeptide of claim 35, wherein said polypeptide comprises a 
sequence selected from SEQ ID NO: 63, 71-72, 74, or 79. 

37. An isolated or recombinant polypeptide comprising a sequence having at

least 99% amino acid sequence identity to the mature region of SEQ ID NO: 56. 

38. An isolated or recombinant polypeptide exhibiting enantioselective 
lipase activity, which polypeptide comprises an amino acid sequence of any one of SEQ ID 
NO: 55 through SEQ ID NO: 108.

39. An isolated or recombinant polypeptide exhibiting enantioselective

lipase activity, which polypeptide comprises at least 45 contiguous amino acid residues of a 
polypeptide encoded by a coding polynucleotide sequence, the coding polynucleotide 
sequence selected from the group consisting of: 

(a) a polynucleotide sequence selected from any of SEQ ID NO: 1 to SEQ ID 

NO: 54;

(b) a polynucleotide sequence that encodes a polypeptide selected from any of 
SEQ ID NO: 55 to SEQ ID NO: 108; and, 

(c) a polynucleotide sequence which hybridizes under stringent conditions 
over substantially the entire length of a polynucleotide sequence (a) or (b), or which 

hybridizes to a subsequence comprising at least 100 nucleotides thereof, wherein the

polynucleotide sequence does not comprise a sequence corresponding to any of GenBank 
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

40. The isolated or recombinant polypeptide of claim 38, wherein the
polypeptide is enantioselective for either a cis form substrate enantiomer or for a trans form 
substrate enantiomer. 

41. The isolated or recombinant polypeptide of claim 40, wherein the 
polypeptide comprises an enantiomeric ratio of at least 2 for the cis form substrate

enantiomer or for the trans form substrate enantiomer. 

42. The isolated or recombinant polypeptide of claim 40, wherein the 
polypeptide comprises an enantiomeric ratio of at least 5 for the cis form substrate 
enantiomer or for the trans form substrate enantiomer. 

43. The isolated or recombinant polypeptide of claim 40, wherein the

polypeptide comprises an enantiomeric ratio of at least 10 for the cis form substrate 
enantiomer or for the trans form substrate enantiomer.

44. The isolated or recombinant polypeptide of claim 40, wherein the 
polypeptide comprises an enantiomeric ratio of at least 50 for the cis form substrate

enantiomer or for the trans form substrate enantiomer.

45. The isolated or recombinant polypeptide of claim 40, wherein the
polypeptide comprises an enantiomeric ratio of at least 100 for the cis form substrate 
enantiomer or for the trans form substrate enantiomer. 

46. An isolated or recombinant polypeptide which is at least 99% or more
identical over a comparison window of 45 contiguous amino acids to one or more of SEQ ID

NO: 55 to SEQ ID NO: 108. 
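To make the comparison-window language of claim 46 concrete, the sketch below slides an ungapped 45-residue window over a query and a reference sequence and reports the best percent identity found. It assumes an ungapped, residue-by-residue comparison, which is a simplification; the function names are hypothetical and the specification's own definition of a comparison window governs. Note that with a 45-residue window, a single mismatch gives 44/45 (about 97.8%), so under this simplification a 99% threshold is only met by a fully identical window.

```python
def best_window_identity(query: str, reference: str, window: int = 45) -> float:
    """Best percent identity between any ungapped window of `window`
    contiguous residues in `query` and any such window in `reference`.
    Returns 0.0 if either sequence is shorter than the window."""
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            r = reference[j:j + window]
            matches = sum(1 for x, y in zip(q, r) if x == y)
            best = max(best, 100.0 * matches / window)
    return best


def shares_claimed_window(query: str, reference: str,
                          window: int = 45, threshold: float = 99.0) -> bool:
    """True if some window pair reaches the identity threshold."""
    return best_window_identity(query, reference, window) >= threshold


if __name__ == "__main__":
    # Toy sequences, not sequences from the listing.
    reference = "ACDEFGHIKLMNPQRSTVWY" * 3
    mutated = reference[:25] + "A" + reference[26:50]
    print(shares_claimed_window(reference, reference))
    print(shares_claimed_window(mutated, reference))
```

The brute-force double loop is adequate for sequences of a few hundred residues such as those in the listing; a production comparison would normally use a dedicated alignment tool.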
47. An isolated or recombinant polypeptide encoded by a nucleic acid
comprising a polynucleotide sequence selected from the group consisting of: 

(a) a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 
54, or a complementary polynucleotide sequence thereof; 
(b) a polynucleotide sequence encoding a polypeptide selected from SEQ ID

NO: 55 to SEQ ID NO: 108, or a complementary polynucleotide sequence thereof; 

(c) a polynucleotide sequence which hybridizes under highly stringent 
conditions over substantially the entire length of polynucleotide sequence (a) or (b), or which 
hybridizes to a subsequence thereof comprising at least 100 residues thereof, wherein the
polynucleotide sequence does not comprise a sequence corresponding to any of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108;

(d) a polynucleotide sequence comprising all or a fragment of (a), (b), or (c), 
wherein the fragment encodes a polypeptide comprising lipase activity; and, 

(e) a polynucleotide sequence encoding a polypeptide, the polypeptide
comprising an amino acid sequence which is substantially identical over at least 45
contiguous amino acid residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108, wherein
the polynucleotide sequence does not comprise a sequence corresponding to any of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

48. A polynucleotide sequence encoding a polypeptide comprising lipase
activity produced by mutating or recombining one or more polynucleotide sequence of claim 
47. 

49. The isolated or recombinant polypeptide of claim 47, the polypeptide 
comprising an amino acid sequence of any one of SEQ ID NO: 55 to SEQ ID NO: 108.

50. The isolated or recombinant polypeptide of claim 47, wherein the 
encoded polypeptide exhibits lipase activity. 

51. The isolated or recombinant polypeptide of claim 50, wherein the
encoded polypeptide exhibits enantioselective lipase activity. 

52. The isolated or recombinant polypeptide of claim 50, wherein the
encoded polypeptide exhibits lipase activity with respect to tributyrin.

53. The isolated or recombinant polypeptide of claim 50, wherein the 
encoded polypeptide exhibits lipase activity with respect to tributyrin in DMF.

54. The isolated or recombinant polypeptide of claim 50, wherein the
encoded polypeptide exhibits lipase activity with respect to tributyrin after heat treatment.

55. The isolated or recombinant polypeptide of claim 50, wherein the 
encoded polypeptide exhibits enantioselective lipase activity with respect to neryl-butyrate.

56. The isolated or recombinant polypeptide of claim 50, wherein the 
encoded polypeptide exhibits enantioselective lipase activity with respect to geranyl-butyrate.

57. The isolated or recombinant polypeptide of claim 50, wherein the
encoded polypeptide exhibits lipase activity with respect to methyl esters. 

58. The isolated or recombinant polypeptide of claim 50, wherein the 
encoded polypeptide exhibits lipase activity with respect to pentadecanolide. 

59. The isolated or recombinant polypeptide of claim 50, wherein the
encoded polypeptide exhibits lipase activity with respect to oxacyclotridecan. 

60. An isolated or recombinant polypeptide comprising at least 45 
contiguous amino acid residues of any of the polypeptides of claim 47, wherein the
polypeptide sequence does not comprise a sequence corresponding to any of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

61. The isolated or recombinant polypeptide of claim 60, which is
substantially identical over at least 180 contiguous amino acids of the encoded polypeptide,
wherein the polypeptide sequence does not comprise a sequence corresponding to any of
GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108.

62. The isolated or recombinant polypeptide of claim 60, which is
substantially identical over at least 212 amino acids of the encoded polypeptide, wherein the
polypeptide sequence does not comprise a sequence corresponding to any of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

63. The isolated or recombinant polypeptide of claim 60, which is
substantially identical over at least 213 amino acids of the encoded polypeptide, wherein the
polypeptide sequence does not comprise a sequence corresponding to any of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

64. The isolated or recombinant polypeptide of claim 60, which is
substantially identical over at least 215 amino acids of the encoded polypeptide, wherein the
polypeptide sequence does not comprise a sequence corresponding to any of GenBank
accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769,
AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874,
AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273,
CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662,
CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957,
S23934, U78785, X95309, Z99105, and Z99108.

65. The polypeptide of claim 1, 18, 35, 36, or 47, comprising a leader
sequence.

66. The polypeptide of claim 1, 18, 35, 36, or 47, comprising a precursor
polypeptide.

67. The polypeptide of claim 1, 18, 35, 36, or 47, wherein the polypeptide 
comprises a secretion signal or a localization signal.

68. The polypeptide of claim 1, 18, 35, 36, or 47, wherein the polypeptide
comprises an epitope tag.

69. The polypeptide of claim 1, 18, 35, 36, or 47, wherein the polypeptide
comprises a fusion protein comprising one or more additional amino acid sequences. 

70. The polypeptide of claim 1, 18, 35, 36, or 47, further comprising a 
polypeptide purification subsequence. 

71. The polypeptide of claim 70, wherein the polypeptide purification
subsequence is selected from the group consisting of: an epitope tag, a FLAG tag, a
polyhistidine sequence, and a GST fusion.

72. The polypeptide of claim 1, 18, 35, 36, or 47, further comprising a
methionine residue at the N-terminus.

73. The polypeptide of claim 1, 18, 35, 36, or 47, wherein the polypeptide
further comprises a modified amino acid.

74. The polypeptide of claim 73, wherein the modified amino acid is
selected from the group consisting of: a glycosylated amino acid, a PEGylated amino acid, a
farnesylated amino acid, an acetylated amino acid, a biotinylated amino acid, an amino acid
conjugated to a lipid moiety, and an amino acid conjugated to an organic derivatizing agent.

75. A composition comprising one or more polypeptide of claim 73 and a 
pharmaceutically acceptable excipient. 

76. A composition comprising one or more polypeptide of claim 1, 18, 35,
36, or 47, and a pharmaceutically acceptable excipient.

77. A polypeptide which comprises a unique subsequence in a polypeptide
selected from SEQ ID NO: 55 to SEQ ID NO: 108, wherein the unique subsequence is
unique as compared to a polypeptide sequence corresponding to an amino acid sequence or
encoded by a nucleic acid sequence corresponding to any of GenBank accession numbers:
1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108.

78. A polypeptide which is specifically bound by a polyclonal antisera raised
against at least one antigen, which at least one antigen comprises at least one amino acid
sequence of SEQ ID NO: 55 to SEQ ID NO: 108, or a fragment thereof, wherein the antisera
is subtracted with a polypeptide sequence corresponding to an amino acid sequence or
encoded by a nucleic acid sequence corresponding to any of GenBank accession numbers:
1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108.

79. An antibody or antisera produced by administering the polypeptide of
claim 1, 18, 35, 36, or 47 to a mammal, which antibody or antisera specifically binds at least
one antigen, said at least one antigen comprising a polypeptide comprising any one of the
amino acid sequences of SEQ ID NO: 55 to SEQ ID NO: 108, which antibody or antisera
does not specifically bind to a peptide encoded by a nucleic acid corresponding to one or
more of GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108.

80. An antibody or antisera which specifically binds a polypeptide, the
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ
ID NO: 55 to SEQ ID NO: 108, wherein the antibody does not specifically bind to a peptide
encoded by a nucleic acid corresponding to one or more of GenBank accession numbers:
1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108.

81. An isolated or recombinant nucleic acid comprising: a polynucleotide 
sequence selected from the group consisting of: 

(a) a polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 
54, or a complementary polynucleotide sequence thereof; 

(b) a polynucleotide sequence encoding a polypeptide selected from SEQ ID
NO: 55 to SEQ ID NO: 108, or a complementary polynucleotide sequence thereof;

(c) a polynucleotide sequence which hybridizes under highly stringent 
conditions over substantially the entire length of polynucleotide sequence (a) or (b), or which 
hybridizes to a subsequence thereof comprising at least 100 residues, wherein the 
polynucleotide subsequence is unique as compared to a sequence corresponding to any of 
GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108; and,

(d) a polynucleotide sequence comprising all or a fragment of (a), (b), or (c), 
wherein the fragment encodes a polypeptide comprising lipase activity, and wherein the 
polynucleotide subsequence is unique as compared to a sequence corresponding to any of 
GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574,
AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617, AF134840,
AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967, C69652,
CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664, CAB51971,
CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068, M74010,
P37957, S23934, U78785, X95309, Z99105, and Z99108.

82. An isolated or recombinant nucleic acid comprising a polynucleotide
sequence encoding a polypeptide, the polypeptide comprising an amino acid sequence which
is substantially identical over at least 45 contiguous amino acid residues of any one of SEQ
ID NO: 55 to SEQ ID NO: 108, wherein the polynucleotide subsequence is unique as
compared to a sequence corresponding to any of GenBank accession numbers: 1I6WA,
1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278,
AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356,
BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196,
CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340,
E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105,
and Z99108.

83. The isolated or recombinant nucleic acid of claim 82, wherein the
polypeptide comprises an amino acid sequence which is substantially identical over at least
45 contiguous amino acid residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108,
wherein the polynucleotide subsequence is unique as compared to a sequence corresponding
to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108.

84. The isolated or recombinant nucleic acid of claim 82, wherein the
polypeptide comprises an amino acid sequence which is substantially identical over at least
180 contiguous amino acid residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108,
wherein the polynucleotide subsequence is unique as compared to a sequence corresponding
to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108.

85. The isolated or recombinant nucleic acid of claim 82, wherein the
polypeptide comprises an amino acid sequence which is substantially identical over at least
212 contiguous amino acid residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108,
wherein the polynucleotide subsequence is unique as compared to a sequence corresponding
to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108.

86. The isolated or recombinant nucleic acid of claim 82, wherein the
polypeptide comprises an amino acid sequence which is substantially identical over at least
213 contiguous amino acid residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108,
wherein the polynucleotide subsequence is unique as compared to a sequence corresponding
to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108.

87. The isolated or recombinant nucleic acid of claim 82, wherein the
polypeptide comprises an amino acid sequence which is substantially identical over at least
215 contiguous amino acid residues of any one of SEQ ID NO: 55 to SEQ ID NO: 108,
wherein the polynucleotide subsequence is unique as compared to a sequence corresponding
to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813, A02815, A34992,
AAA22574, AAB31769, AAC12257, AAD30278, AAF40217, AAF63229, AB000617,
AF134840, AF141874, AF237623, AJ297356, BAA11406, BAA22231, BAB05967,
C69652, CAA00273, CAA00274, CAA02196, CAA64621, CAB12064, CAB12664,
CAB51971, CAB92662, CAB95850, D78508, E01340, E01903, E02083, E05047, JW0068,
M74010, P37957, S23934, U78785, X95309, Z99105, and Z99108.

88. The nucleic acid of claim 81 or 82, wherein the encoded polypeptide
exhibits lipase activity. 

89. The nucleic acid of claim 81 or 82, wherein the encoded polypeptide
exhibits enantioselective lipase activity. 

90. The nucleic acid of claim 88, wherein the encoded polypeptide exhibits
lipase activity with respect to tributyrin.

91. The nucleic acid of claim 88, wherein the encoded polypeptide exhibits
lipase activity with respect to tributyrin in DMF. 

92. The nucleic acid of claim 88, wherein the encoded polypeptide exhibits
lipase activity with respect to tributyrin after heat treatment.

93. The nucleic acid of claim 89, wherein the encoded polypeptide exhibits 
enantioselective lipase activity with respect to neryl-butyrate.

94. The nucleic acid of claim 89, wherein the encoded polypeptide exhibits 
enantioselective lipase activity with respect to geranyl-butyrate. 

95. The nucleic acid of claim 88, wherein the encoded polypeptide exhibits
lipase activity with respect to methyl esters.

96. The nucleic acid of claim 88, wherein the encoded polypeptide exhibits
lipase activity with respect to pentadecanolide. 

97. The nucleic acid of claim 88, wherein the encoded polypeptide exhibits
lipase activity with respect to oxacyclotridecan. 

98. An isolated or recombinant nucleic acid comprising a polynucleotide
sequence encoding a polypeptide comprising lipase activity produced by mutating or
recombining one or more polynucleotide sequence of claim 81.

99. The nucleic acid of claim 98, wherein the encoded polypeptide
comprises enantioselective lipase activity. 

100. The nucleic acid of claim 81, 82, or 98, wherein the encoded polypeptide
comprises a leader sequence. 

101. The nucleic acid of claim 81, 82, or 98, wherein the encoded polypeptide
comprises a precursor peptide.

102. The nucleic acid of claim 81, 82, or 98, wherein the encoded polypeptide 
comprises an epitope tag sequence. 

103. The nucleic acid of claim 81, 82, or 98, wherein the nucleic acid encodes 
a fusion protein, said nucleic acid comprising one or more additional nucleic acid sequences.

104. A composition comprising two or more nucleic acids of claim 81, 82, or
98.

105. A composition of claim 104, wherein the composition comprises a 
library comprising at least about 2, 5, 10, 50, or more of the nucleic acids. 

106. A composition produced by cleaving of one or more nucleic acid of
claim 81, 82, or 98.

107. The composition of claim 106, wherein the cleaving comprises
mechanical, chemical or enzymatic cleavage. 

108. The composition of claim 107, wherein the enzymatic cleavage
comprises cleavage with a restriction endonuclease, an RNAse or a DNAse. 

109. A composition produced by a process comprising incubating one or more 
nucleic acids of claim 81, 82, or 98, in the presence of deoxyribonucleotide triphosphates and
a nucleic acid polymerase.

110. The composition of claim 109, wherein the nucleic acid polymerase is a 
thermostable polymerase. 

111. A cell comprising at least one nucleic acid of claim 81, 82, or 98, or a 
cleaved or amplified fragment or product thereof.

112. The cell of claim 111, wherein the cell expresses a polypeptide encoded
by the nucleic acid.

113. A vector comprising the nucleic acid of claim 81, 82, or 98.

114. The vector of claim 113, wherein the vector comprises a plasmid, a 
cosmid, a phage, a virus or a fragment of a virus. 

115. The vector of claim 113, wherein the vector comprises an expression
vector.

116. A cell transduced by the vector of claim 113. 

117. A composition comprising a polypeptide encoded by a nucleic acid 
selected from the nucleic acids of claim 81, 82, or 98, and an excipient. 

118. The composition of claim 117, wherein the excipient is a
pharmaceutically acceptable excipient.

119. A composition comprising a polypeptide of claim 81, 82, or 98, wherein 
the composition comprises a cleaning solution. 

120. The composition of claim 119, wherein the cleaning solution further
comprises one or more of: a builder, a surfactant, a polymer, a bleach system, a structurant, a
pH adjuster, a humectant, or a neutral inorganic salt.

121. A nucleic acid which comprises a unique subsequence in a nucleic acid 
selected from SEQ ID NO: 1 to SEQ ID NO: 54, wherein the unique subsequence is unique 
as compared to a nucleic acid sequence corresponding to any of GenBank accession 
numbers: 1I6WA, 1I6WB, A02813, A02815, A34992, AAA22574, AAB31769, AAC12257,
AAD30278, AAF40217, AAF63229, AB000617, AF134840, AF141874, AF237623,
AJ297356, BAA11406, BAA22231, BAB05967, C69652, CAA00273, CAA00274,
CAA02196, CAA64621, CAB12064, CAB12664, CAB51971, CAB92662, CAB95850,
D78508, E01340, E01903, E02083, E05047, JW0068, M74010, P37957, S23934, U78785,
X95309, Z99105, and Z99108.

122. A target nucleic acid which hybridizes under stringent conditions to a
unique coding oligonucleotide which encodes a unique subsequence in a polypeptide selected
from SEQ ID NO: 55 to SEQ ID NO: 108, wherein the unique subsequence is unique as
compared to an amino acid sequence or to a polypeptide encoded by a nucleic acid sequence
corresponding to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108.

123. The nucleic acid of claim 122, wherein the stringent conditions are
selected such that a perfectly complementary oligonucleotide to the coding oligonucleotide
hybridizes to the coding oligonucleotide with at least a 5x higher signal to noise ratio than for
hybridization of the perfectly complementary oligonucleotide to a control nucleic acid
corresponding to any of GenBank accession numbers: 1I6WA, 1I6WB, A02813,
A02815, A34992, AAA22574, AAB31769, AAC12257, AAD30278, AAF40217,
AAF63229, AB000617, AF134840, AF141874, AF237623, AJ297356, BAA11406,
BAA22231, BAB05967, C69652, CAA00273, CAA00274, CAA02196, CAA64621,
CAB12064, CAB12664, CAB51971, CAB92662, CAB95850, D78508, E01340, E01903,
E02083, E05047, JW0068, M74010, P37957, S23934, U78785, X95309, Z99105, and
Z99108, wherein the target nucleic acid hybridizes to the unique coding oligonucleotide with
at least about a 2x higher signal to noise ratio as compared to hybridization of the control
nucleic acid to the coding oligonucleotide.

124. A database comprising one or more character strings corresponding to a 
polynucleotide sequence selected from SEQ ID NO: 1 to SEQ ID NO: 54 or a polypeptide
sequence selected from SEQ ID NO: 55 to SEQ ID NO: 108.

125. The database of claim 124, wherein the one or more character strings is 
recorded in a computer readable medium. 

126. The database of claim 125, wherein the computer readable medium 
comprises a medium that resides internal or external to a computer. 

127. A method for manipulating a sequence record in a computer system, the 
method comprising: 

(a) reading a character string corresponding to a polynucleotide sequence 
selected from SEQ ID NO: 1 to SEQ ID NO: 54 or a polypeptide sequence selected from 
SEQ ID NO: 55 to SEQ ID NO: 108 or a subsequence thereof;

(b) performing an operation on the character string; and, 

(c) returning a result of the operation. 
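
By way of example only, steps (a) to (c) of claim 127 can be sketched as a small routine that reads a character string, applies an operation to it, and returns the result. The record store, the record identifier, and the GC-content operation below are hypothetical illustrations; the short string used is a placeholder and is not any of SEQ ID NO: 1 to SEQ ID NO: 108.

```python
from typing import Callable, Dict


def manipulate_sequence_record(
    records: Dict[str, str],
    record_id: str,
    operation: Callable[[str], object],
):
    """Sketch of claim 127: read a string, operate on it, return the result."""
    character_string = records[record_id]   # (a) read the character string
    result = operation(character_string)    # (b) perform an operation on it
    return result                           # (c) return the result


def gc_content(sequence: str) -> float:
    """Example operation: fraction of G and C characters in a nucleotide string."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Placeholder record for illustration only; not an actual SEQ ID sequence.
EXAMPLE_RECORDS = {"example_record": "ATGGCC"}
```

Any callable taking a string can stand in for the operation, which mirrors the open-ended list of operations recited in claim 131.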

128. The method of claim 127, comprising reading a character string selected
by a user.

129. The method of claim 128, wherein the user selects the character string 
from a database or inputs the character string into the computer system.

130. The method of claim 127, comprising transmitting the selected character
string to an output device. 

131. The method of claim 127, comprising performing one or more operations
selected from among: a local sequence comparison, a sequence alignment, a sequence
identity or similarity search, a structural similarity search, a sequence identity or similarity
determination, a structure determination, a nucleic acid motif determination, an amino acid
motif determination, a hypothetical translation, a determination of a restriction map, a
sequence recombination, or a BLAST determination.

132. The method of claim 131, comprising aligning the selected character 
string with one or more additional character strings corresponding to a polynucleotide or 
polypeptide sequence. 

133. The method of claim 131, comprising translating one or more character 
strings selected from SEQ ID NO: 1 to SEQ ID NO: 54, into a character string corresponding 
to an amino acid sequence or translating a character string selected from SEQ ID NO: 55 to
SEQ ID NO: 108, into a character string corresponding to a polynucleotide sequence. 
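
The two translation directions of claim 133 can be illustrated, without limitation, by the following sketch. It assumes the standard genetic code and, because the code is degenerate, the reverse direction simply picks the first listed codon for each amino acid, which is an arbitrary choice made for illustration. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One representative codon per amino acid (first encountered in the table).
FIRST_CODON = {}
for _codon, _aa in CODON_TABLE.items():
    FIRST_CODON.setdefault(_aa, _codon)


def translate(dna: str) -> str:
    """Translate a nucleotide string in frame 1, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def back_translate(protein: str) -> str:
    """Return one of the many nucleotide strings that encode the given protein."""
    return "".join(FIRST_CODON[aa] for aa in protein.upper())
```

Because many nucleotide strings encode the same polypeptide, a practical reverse translation would usually choose codons according to the usage of the intended host rather than the first table entry.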

134. The method of claim 131, comprising determining sequence identity or 
similarity between the selected character string and one or more additional character strings, 
by evaluating codon usage. 

135. The method of claim 134, comprising determining optimal codon usage. 
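
As an illustrative sketch of the codon usage evaluation of claims 134 and 135, codons can be tallied for each character string and the resulting profiles compared. The cosine measure used below is an assumption chosen for simplicity; the claims do not specify a particular similarity measure, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def codon_counts(sequence: str) -> Counter:
    """Count in-frame codons (frame 1); trailing partial codons are ignored."""
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3
    return Counter(sequence[i:i + 3] for i in range(0, usable, 3))


def usage_similarity(seq_a: str, seq_b: str) -> float:
    """Cosine similarity between the codon count profiles of two sequences."""
    counts_a, counts_b = codon_counts(seq_a), codon_counts(seq_b)
    dot = sum(counts_a[c] * counts_b[c] for c in counts_a)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A value near 1 indicates that two strings use codons in similar proportions, while a value of 0 indicates that they share no codons.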

136. The method of claim 127, comprising obtaining the result of the 
operation on a user output device.

137. The method of claim 136, wherein the user output device is selected 
from among: a display monitor, a printer, and an audio-output. 

138. The method of claim 127, wherein the operation comprises transmitting 
the character string to a device capable of producing a physical embodiment of the character 
string. 

139. The method of claim 138, comprising obtaining the result by obtaining a
nucleic acid corresponding to the character string. 

140. The method of claim 138, comprising obtaining the result by obtaining a
polypeptide or peptide corresponding to the character string or a sub-portion thereof. 

141. The method of claim 138, wherein the device comprises an
oligonucleotide synthesizer.

142. The method of claim 138, wherein the device comprises a peptide
synthesizer.

143. A method of producing a modified or recombinant nucleic acid 
comprising mutating or recombining a nucleic acid of claim 81 or 82.

144. A modified or recombinant nucleic acid produced by the method of
claim 143.

145. The method of claim 143, comprising recursively recombining the 
nucleic acid with one or more additional nucleic acids.

146. A modified or recombinant nucleic acid produced by the method of
claim 145.

147. The method of claim 145, wherein the one or more additional nucleic 
acids encode a polypeptide comprising lipase activity or an amino acid subsequence or 
fragment thereof.

148. The method of claim 145, wherein the recursive recombination is
performed in vitro.

149. The method of claim 145, wherein the recursive recombination is 
performed in vivo. 

150. The method of claim 145, wherein the recursive recombination produces
at least one library of recombinant nucleic acids, which library comprises at least one 
polypeptide comprising lipase activity, or a homologue thereof. 

151. A nucleic acid library produced by the method of claim 150.

152. A population of cells comprising the library of claim 151. 

153. The modified or recombinant nucleic acid produced by the method of
claim 143.

154. A cell comprising the nucleic acid of claim 153. 

155. A method of producing a polypeptide, the method comprising: (a) 
introducing a nucleic acid of claim 81, 82, or 98, into a population of cells, which nucleic 
acid is operably linked to a regulatory sequence capable of directing expression of a 
polypeptide encoded by the nucleic acid in at least a subset of the population of cells or 
progeny thereof; and, (b) expressing the polypeptide in at least the subset of the population of 
cells or progeny thereof. 

156. The method of claim 155, further comprising isolating the polypeptide
from the cells.

157. The method of claim 155, comprising expressing the polypeptide by 
culturing the population or subset of the population of cells in a nutrient medium under 
conditions in which the regulatory sequence directs expression of the polypeptide encoded by 
the nucleic acid. 

158. The method of claim 155, further comprising isolating or recovering the
polypeptide from the cells or from the nutrient medium. 

159. The method of claim 155, wherein the culturing is performed in a bulk 
fermentation vessel. 

160. The method of claim 155, wherein the cells are bacterial cells.

161. The method of claim 155, wherein the cells are eukaryotic cells.

162. The method of claim 161, wherein the cells are fungal cells, yeast cells,
plant cells, insect cells, or mammalian cells. 

163. The method of claim 162, wherein the mammalian cells comprise
fertilized oocytes, embryonic stem cells, or pluripotent stem cells, further comprising
regenerating a transgenic mammal expressing the polypeptide, and recovering the
polypeptide from the transgenic mammal or a by-product of the transgenic animal.

164. The method of claim 163, wherein the by-product is milk. 

165. A polypeptide produced by the method of claim 155.

166. A method of hydrolyzing a lipid to therapeutically or prophylactically 
treat a gastrointestinal lipid related condition/disease/disorder, the method comprising:
expressing in a target cell or contacting a target cell with an effective amount of a
polypeptide of claim 1 or 47. 
